Preface
The Shape of the Book 


This is a book on the history of mathematics; its basic dynamic is 
historical and therefore, up to a point, chronological. It follows the 
progress of a number of ideas that grew, sometimes came together, and
often developed rich and fascinating branches and applications. At its 
core is an account of how the calculus of Newton and Leibniz—the 
calculus of functions of a single variable—led to attempts to develop a 
calculus of functions of several variables and how these new
mathematical methods contributed to the study, first of ordinary, and 
then of partial differential equations. In each case, the rationale for that 
work was chiefly to develop general methods that could tackle 
problems in geometry and mechanics (the motions of solids and liquids 
under the action of forces). 

The physical world being a complicated place, most of the 
applications involved partial differential equations, and here the story 
soon also became complicated. The first-order partial differential 
equation in two independent variables was initially difficult to solve, 
and this posed problems for the study of more than two independent 
variables and for equations of higher order. Important work on the 
first-order case was done by Lagrange and Monge before Cauchy was 
finally able to show that such equations almost always have a solution. 
But the second-order case almost immediately confined itself to three 
special cases, somewhat as Euler had suggested, and all of them, as we 
would say, linear. The first, and simplest, is the wave equation (the 
prototype hyperbolic equation), successfully tackled by d'Alembert. 
Euler regarded the one later known as the elliptic case (the key 
example being the Laplace equation) as being beyond current methods. 
Finally, the case we call parabolic fell through a gap in his approach, and 
strangely little was said about it before Fourier dealt with the canonical 
example: the heat equation. At this point, a significant departure from 
the theory of ordinary differential equations opened up: the need to pay 
attention to initial or boundary conditions. However, this issue was to 
remain obscure for several decades. 


Euler quickly showed that linear ordinary differential equations 
with constant coefficients can be solved systematically. Other types of 
ordinary differential equations were studied in the eighteenth century, 
but the story is piecemeal, and instead, I chose to give just one example 
of the history of ordinary differential equations: the hypergeometric 
equation from Gauss to Riemann, Schwarz, and Poincaré. This is one of 
the glories of the subject, bringing together early ideas about group 
theory, complex function theory, and the then-novel hyperbolic or non- 
Euclidean geometry. 

So how is all this material organised in this book? Chapter 1 
connects the calculus to problems in ordinary differential equations 
and is mirrored by Chaps. 3 and 5 in which the calculus of several 
variables is developed and the first partial differential equations are 
studied. Then it is a fairly straight run through topics in partial 
differential equation theory in Chaps. 6, 8, 10, 13, 17-20. This allows us 
to see how the work of Euler, d'Alembert, and a few others rewrote 
Newton's Principia Mathematica for the eighteenth-twentieth 
centuries. The story of the hypergeometric equation occupies Chaps. 11 
and 14-16 because it must start in 1812 with Gauss and because once 
it gets going it seems ridiculous to break it up. What intervenes here is 
Chap. 12 on Cauchy’s demonstration of the existence of solutions to 
ordinary differential equations and Chap. 13 on Riemann’s geometric 
version of complex function theory, which is needed for the subsequent 
three chapters. 

What of the chapters not yet referred to? Chapter 2 describes the 
start of the calculus of variations, and Chap. 7 takes that subject further 
into the eighteenth century. Chapter 4 documents other successes of 
the partial differential calculus in studying natural phenomena other 
than the wave equation. (There is also a surprising link to the 
hypergeometric equation.) Chapters 9 and 21 are opportunities for 
revision; when I gave the course I used these lectures to discuss the 
assessment on the course so far. 

The remaining chapters move into what may be less familiar 
material: Riemann’s study of shock waves; Riemann and Weierstrass on
minimal surfaces; the work of Thomson and Stokes on the 
telegraphist’s equation and the laying of the trans-Atlantic cable; a look 
at the first nineteenth-century attempts to rigorise the calculus of
variations; the eventual introduction of the fundamental trichotomy
(elliptic, parabolic, hyperbolic) for second-order linear partial 
differential equations and the first general existence theorems in the 
elliptic and hyperbolic cases, including Hadamard’s insistence on the
distinction between initial and boundary value problems. Two chapters 
look at how Jacobi used Hamilton’s ideas to create Hamilton-Jacobi 
theory and subsequent attempts to geometrise mechanics, and the 
connection to the solution of first-order partial differential equations. 

All this material has a certain coherence that is worth spelling out. 
Ordinary differential equations grew out of, or alongside, problems in 
evaluating integrals, which is why we still talk, confusingly, of 
integrating a differential equation and its solutions as its integrals. It 
was soon recognised that the solution to an ordinary differential 
equation was a family of functions and an individual solution could be 
specified by means of some initial conditions. So, it was natural when 
differential equations with several independent variables were 
investigated that the earliest researchers (Jean le Rond d’Alembert,
Leonhard Euler, Pierre Simon Laplace, and Joseph-Louis Lagrange) 
thought of these partial differential equations in the same way, and 
looked for techniques that would produce a formula for the general 
solution (however, they seldom also discussed an auxiliary process of 
fitting the general solution to some initial conditions). Part of the story 
here is the gradual recognition that this is not the right way to think of 
partial differential equations. Rather, it is a dialogue between the 
general methods and the initial or boundary conditions that is central, 
and which underpins the crucial distinction between the elliptic and 
hyperbolic types to which formal methods are blind. As we shall see, 
this explains the problematic way in which complex variables were first 
used. 

It is also interesting to see how questions of rigour enter the story 
in a way that is immediately important and does not appear as the 
whim of an analytic pedant. The ad hoc methods that can be used to 
solve a partial differential equation (such as the separation of variables) 
naturally raise the question of the uniqueness of the solutions that is 
important in applications. The need for power series to converge forces 
a heavy reliance on (real or complex) analytic methods that has, 
ultimately, to be outflanked. 


Advice to Students 


There are several important things being described in this course, and 
it may help to remember what they are before immersing yourself in 
the technical details. Newton’s work is a case in point. Even though we 
shall only skim its surface, it is clear that this is a remarkable 
achievement, one that it took more than a century to confirm, and some 
of the best work of the twentieth century to surpass. Newton’s account 
of the motion of the Moon and the planets does not rest on the calculus, 
still less differential equations, but everyone after him turned to the 
calculus, and Euler gave everyone the means to write celestial 
mechanics that way ever after. Mathematical accounts of fundamental 
physical processes—gravity, the production and spread of sound, the 
propagation of heat and of electric signals—are among the successes of 
the theory of partial differential equations. 

But for the calculus to do this work, mathematicians have to have 
the confidence that it does work. This is partly a matter of rigour, and 
indeed it is satisfying to see that so many of the questions that are dealt 
with in courses on pure analysis arose in contexts where a practical, or 
at least a physical, answer depended on the quality of the reasoning. 
Less obviously, but perhaps more interestingly, it is worth seeing what 
even the best mathematicians did with difficult problems. Arguments 
have to be rigorous—ultimately. Before then they have to be some or all 
of convincing, intelligible, general, plausible, and applicable. Likewise, 
solutions have to be a number of things apart, ideally, from being right: 
among the criteria are, on one occasion or another, computable, 
accurate enough, intelligible, complete, and unique. Just as an argument 
might get to the heart of a problem or somehow merely work, a 
solution can be truly informative or merely a formula. All of this is on 
display here. And, of course, sometimes an equation with no answer in 
sight can still seem to be a valuable advance. 

In this sense, the most momentous event on display here in the first 
half of the course is that the calculus, in the form of differential 
equations, both ordinary and partial, can deliver so much insight. The 
comparable change in the second half of the course, as I hinted above, is 
the transformation in what a solution is taken to be. For partial 
differential equations this is the rise to equal importance with the
equation of the boundary or initial conditions, coupled as it is with a
profound classification of these equations into types. The need for 
rigour played its part in these developments when appeals to the so- 
called generality of analysis and its supposed algebraic or formal basis 
began to fail. 

This is not a set of lectures in which epsilons and deltas, ns and Ns
dance ever more intricately, but this should not suggest that when
mathematics is applied—whatever that might mean—standards drop. 
Mathematicians were doing their best at all times to get it right, 
although we can observe different ways in which they honoured that 
commandment. The difficult mathematics here comes from the 
difficulty of the problems: a partial differential equation is a difficult 
thing to understand, harder than an ordinary differential equation, and 
harder than many an early investigator realised. Qualitative arguments 
are often harder than quantitative ones, if less technical. 

The challenge you face is to get a sense of that struggle, of the 
difficulty, and how it was tackled. 

Being a historian of mathematics means attending to mathematics 
on its own terms as well as ours, and seeing it in the context of its time. 
What was known, what was thought to be true? When a mathematician 
tackles a problem, you ask: How had other problems like this one been 
tackled, how were they tackled after this one? Is the analysis of the 
problem convincing, is the solution informative? What, in the end, were 
these people trying to do? 


Historiographical Remarks 


There are several existing accounts of the history of calculus, and a
number of specialist books and articles on particular aspects of that 
history. The contributions of Newton and Leibniz, Euler, Lagrange, 
Fourier, Cauchy, Riemann, Weierstrass, Poincaré, and Hadamard have 
been studied in some depth; various topics, such as the wave equation, 
the heat equation, Laplace’s equation, and the Dirichlet problem have 
been looked at in some detail, although not always after the original 
breakthroughs were made. But there is no general history of 
differential equations, ordinary or partial. Histories of the calculus 
dwell on the story of the rigorisation of the calculus and the creation of
modern (or, rather, nineteenth-century) mathematical analysis, but tend
to marginalise the story of what made the calculus valuable: the 
capacity it gives mathematicians and scientists to formulate and solve 
problems across the fields of physics and geometry. Historians have 
tended to forget that what made the calculus worth all the efforts to 
understand it was not ideas about infinitesimals, differentials, limits, 
and the like that were introduced to explain and justify it, but its many 
successes in providing an understanding of the natural world, from the 
motion of the planets to the transmission of electric signals, and in 
extending the powers of geometry. 

There have been a few notable departures from this scholarly 
regime in recent years. Craig Fraser and Jesper Lützen have steadily
enriched our understanding, and other historians (Tom Archibald, June 
Barrow-Green, Umberto Bottazzini, Christian Gilain, and Tom Hawkins, 
among them) have dealt with various aspects of the development of the 
theory of differential equations as it entered into the larger pictures 
they were exploring. 


What This Book is Not 


The largest omission is the work of Maxwell and the equations named 
after him, but it seemed to me that the modern theory, and the physical 
experiments that it explains, are largely unknown to mathematics 
students and would have entailed too great a detour to bring to life. In 
addition, Maxwell’s ideas about the physics involved are not the 
modern ones—and famously, no Continental physicist claimed to 
understand them—so it would have been impossible to do them justice 
in the space available. Disappointed readers should consult Buchwald’s 
From Maxwell to Microphysics. For much the same reasons, I was unable 
to deal with hydrodynamics and the Navier-Stokes equations, but 
readers may always turn to Darrigol’s Worlds of Flow. 

Another topic that is wholly missing is the use of perturbative 
methods. Most differential equations, and systems of such equations, 
that arose in practice could only be tackled by the method of 
undetermined coefficients. The idea was to start from a simplified 
version of the problem at hand that, however, admitted an exact 
solution, and to seek the solution to the actual problem
by adding more terms to cope with the increased complexity. These 
might take the form of power series, or later trigonometric series, 
which were fitted to what data there was, especially in the important 
subject of astronomy, and their coefficients adjusted to refine 
predictions and explain other effects. 

Another important topic that it would be good to have included is 
Sturm-Liouville theory, but there is already an excellent historical 
account in Lützen’s Joseph Liouville (1809-1882): Master of Pure and
Applied Mathematics, and I thought it better to add to the stock of
historical information about the development of differential equations. 

I would very much have liked to have concluded the course with 
Poincaré’s ideas about flows on surfaces, and his brilliant extension of 
these techniques in his famous memoir on the three-body problem, 
which would have made an attractive connection back to Newton’s 
Principia Mathematica, but there simply was no room. However, there 
are existing accounts of this subject.

And, I admit, the wish to try to say something about the history of 
partial differential equations, surely the largest omission in the history
of modern mathematics, also played a part in my decisions about what
to include. 

So, dear reader, if your favourite topic is not here, and especially if 
there is not a good modern history of it, then the opportunity is there 
for you to write it and in that way fill a gap in the literature. There are 
short overviews of the historical development of differential equations
(for example in (Kline 1972)) and there are detailed treatments of 
selected topics, as I have tried to acknowledge and benefit from. There 
is ongoing work by a number of historians of mathematics, but the fact 
remains that the history of mathematics is tied closely to the history of 
pure mathematics through a shared interest in foundations, and the 
history of classical applied and applicable mathematics lags behind. 
This would be merely unfortunate, were it not for the fact that it is 
through differential equations that the calculus largely justified its 
existence—geometry being another, but smaller, vital field. 

Without histories of differential equations, we lack a significant part 
of the history of mathematics. We cannot properly explain to our 
students where we are coming from and how we got here, we cannot 
explain the significance of mathematics to historians of science, and we 
are hindered in our attempts to rescue philosophical accounts of 
mathematics from the grip of foundationalists, who see only set theory 
and logic. 

There are considerable losses. We are likely to leap from Newton, 
Leibniz, and the invention of the calculus straight to Cauchy, with 
perhaps a glance at Lagrange’s unsuccessful earlier attempt at 
rigorising the calculus. In this way, the entire eighteenth century is 
largely forgotten and is only dealt with in fragments. The study of 
partial differential equations is reduced to what I jokingly refer to as 
solving the only four partial differential equations that exist: the 
general first-order partial differential equation, Laplace’s equation, the 
heat equation, and the wave equation. 

This book is an attempt to fill in some of the gaps. 


Sources and Their Uses 


There is inevitably a shortage of material in English on this subject.
Newton, of course, has been generously put into English when he did
not write it himself; J. M. Child translated some of Leibniz’s
considerable and mostly unpublished writings of relevance in (Child 
1920), and for the eighteenth century, there is the remarkable and 
growing resource of the Euler Archive, where almost all of the original 
work of Euler can be found along with many substantial translations. 
Among the nineteenth-century mathematicians, almost all of Riemann’s 
work is now in English, as are Hilbert’s remarks in his Paris address on 
Mathematical Problems; and Hamilton and Green naturally wrote in 
English. The rest remains in Latin, French, and German, and a richer 
study would embrace Italian and Russian. 

Source books have done something to ease the students’ paths: 
Struik’s on the period 1200-1800 and Birkhoff’s rather freer 
translations of nineteenth-century work are very helpful, and more can 
be found in the book by Fauvel and Gray (referred to here as F&G). 
Historians’ translations of shorter extracts can also be found in their 
papers. 

I have therefore added to the collection of works translated into 
English some items from Cauchy (on the existence of solutions to first- 
order partial differential equations), Darboux (on the telegraphist’s
equation), Schwarz (on analytic maps from a half-plane or disc to a 
polygon, his alternating method, and part of his paper on the 
hypergeometric equation), and a passage from the introduction Picard 
wrote to one of his papers. 

As for illustrations, I had originally planned to include pictures of 
most of the important mathematicians whose work is discussed in this 
book, but copyright issues posed obstacles in a number of cases. 
However, these days a great many pictures, very often accurately 
identified, are available on the internet. 


Advice to Instructors 


This book is the fourth and last of my books based on courses I taught,
each over a period of four years, at the University of Warwick. Together, 
they cover the emergence of a fair amount of mathematics in the 
standard syllabus at many universities today. They have been published 
at a time when the prospects for courses in the history of modern 
mathematics in Britain have become poor, I believe for two reasons. 


First, in Britain as in many places, there seem to be few prospects for 
anyone wanting to shape a career as a historian of mathematics; 
students know this, and they seldom entertain the idea at the graduate 
level. Second, there are problems for anyone wanting to run a course
in the subject: problems of language, problems of sources, problems 
with assessment. These four books are offered partly as a way around 
the second problem, and that accounts for their content, specifically the 
three chapters on assessment. I wanted to show that there are ways to 
assess students’ grasp of the history of mathematics that are not simply
exercises in old mathematics, and the result is the adaptation of what 
my Open University colleagues and I did to more advanced topics. 

There are many reasons why a course in the history of mathematics 
at university can benefit students. It humanises the subject, 
demonstrates the intent behind many discoveries, and helps to explain 
why we have the mathematics we do. It always seemed to me that any 
history of mathematics course best belonged in the students’ final year, 
when they already know enough mathematics for the history to get a 
proper look in. In a world in which few students go on to do research in 
straight mathematics, but many go on to be mathematicians in a huge 
variety of environments, I believe that a historical overview of part of 
the subject offers at least as much value as any other specialism. 

Of course, I do not claim that any of the four volumes is the course 
to adopt. It might well make sense to use any of the books selectively. 
This one might yield a course on partial differential equations, or a
short course on the hypergeometric equation, for example. It will 
depend on the audience. And I would cheerfully admit that almost 
every chapter here is too long to be a lecture; indeed I never taught it 
that way. Each chapter is a resource, and in the absence of other 
material for the student to read, I thought it best to provide enough for 
readers to engage with. There is more than enough for three lectures a 
week here, but not too much for a week’s study. 


1.1 Introduction


The chapter is in two parts. In the first, we look briefly at the discovery 
of methods for finding tangents, which is, of course, part of the
seventeenth-century discovery of the methods of calculus. The first 
inverse tangent problem—the precursor of differential equations—was 
asked early on, in 1638, and it is interesting to see that neither 
Descartes nor Leibniz could properly solve it. 

In the second part, we see how the calculus, in Euler’s hands, led to 
the development of methods for solving various kinds of ordinary 
differential equation, including Debeaune’s. The work of Euler and 
Bernoulli on the vibrating rod and the hanging chain led to a breakthrough in
the study of linear differential equations and the introduction of the 
idea of a basis of solutions. Even more importantly, Euler was able to 
adapt the methods of the calculus to the study of mechanics, and so was 
able to express Newton’s laws of motion for the first time as differential 
equations. 


1.2 Origins: Inverse Tangent Problems 


In the 1620s and 1630s, various mathematicians—Pierre Fermat, René 
Descartes, and Gilles Personne de Roberval among them—began to 
develop methods for finding tangents to curves, either at a given point 
on the curve or from an arbitrary point to the curve. With the
success of these methods, it became possible to think of raising and 
answering the opposite question, that of finding a curve given some 
properties of its tangents. 

The person with the honour of having formulated the first inverse 
tangent problem is Florimond Debeaune. Debeaune was a wealthy 
member of the nobility in his hometown of Blois, where he was born in 
1601 and where he became a counselor at the Court of Justice. He also 
had a reputation as a high-quality lens grinder, and in 1639, Descartes 
wrote to him to ask him to design a machine that would make 
hyperbolic lenses. The project failed, but they remained in touch, and 
Debeaune went on to write the Notes brièves that were published in
1649 in the first Latin edition of Descartes’s La géométrie. In this work,
he showed that the equations $y^2 = xy + bx$, $y^2 = -dy + bx$, and
$y^2 = bx - x^2$ represent a hyperbola, a parabola, and an ellipse,
respectively.
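
In modern terms, the type of each of these three curves can be read off from the sign of the discriminant B^2 - 4AC of the quadratic part Ax^2 + Bxy + Cy^2. The following is a minimal sketch of that modern check, not Debeaune's own argument; the function name `classify_conic` is hypothetical, and degenerate cases (such as b = 0) are ignored.

```python
def classify_conic(A, B, C):
    """Classify A*x^2 + B*x*y + C*y^2 + (lower-order terms) = 0.

    Uses the sign of the discriminant B^2 - 4AC and assumes the conic
    is non-degenerate (an assumption of this sketch).
    """
    disc = B * B - 4 * A * C
    if disc > 0:
        return "hyperbola"
    if disc == 0:
        return "parabola"
    return "ellipse"


# Quadratic parts of the three equations, each moved to one side:
#   y^2 - xy - bx = 0   ->  A = 0, B = -1, C = 1
#   y^2 + dy - bx = 0   ->  A = 0, B = 0,  C = 1
#   x^2 + y^2 - bx = 0  ->  A = 1, B = 0,  C = 1
for coeffs in [(0, -1, 1), (0, 0, 1), (1, 0, 1)]:
    print(coeffs, classify_conic(*coeffs))
```

The coefficients b and d do not enter the test, because they multiply only first-degree terms.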

Debeaune was led to propose his inverse tangent problem in 1638 as a
result of his study of Descartes’s La géométrie (so soon did Descartes’s
ideas begin to transform geometry). He raised it out of his interest in explaining
mathematically why a plucked string vibrates as it does, specifically, in
explaining why the frequency with which the string vibrates is
independent of the force with which it is struck. It was one of four
problems that he presented to the mathematical community, and it has
come down to us in the form of a letter to Roberval.


1.2.1 Debeaune’s Problem 
Debeaune stated the problem this way.


Fig. 1.1 Debeaune’s problem 


Let there be a curve AXE whose vertex is A, axis AYZ, and the 
property of this curve is that, having taken any point on it you 
wish, say X, from which the line XY is drawn as a perpendicular 
ordinate to the axis, and having taken the tangent GXN through 
the same point X, and extended the perpendicular XZ to it at X 
until it meets the axis, there will be the same ratio of ZY to YX as 
a given line, like AB, has to the line YX — AY (Fig. 1.1). 


Draw the axis, AYZ, a curve, AXE, a tangent, GXN, at some point, X, on the 
curve, and erect the perpendicular, XY, as shown. Locate the line 
segments ZY, YX, and AY, and the line segment AB that provides a unit of 
length. Debeaune’s problem asks to find the curve, AXE, with the 
property of its tangents that 


\[
\frac{ZY}{YX} = \frac{AB}{YX - AY}.
\]

Debeaune probably expected an answer in the form of a recipe for
constructing points on the curve geometrically, rather than as an
equation in some system of coordinates, but in any case, he was to be
unlucky. Roberval showed that the line through B drawn at 45° to the axis
is an asymptote to the curve, and in October 1638, Descartes gave a
mechanical description of how the curve might be drawn
approximately, which was sufficient to confirm Roberval’s result, but no
one could answer the challenge before Debeaune died in 1652.
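
Roberval's asymptote can be checked numerically in modern notation. The sketch below rests on assumptions that are not in the source: it writes AY = x, YX = y, and AB = a, and reads ZY as the subnormal y dy/dx, so that the stated property becomes dy/dx = a/(y - x). It then follows the curve from just beyond the vertex A with a standard Runge-Kutta scheme and reports how far it lies from the line y = x + a. The function names are hypothetical.

```python
import math


def debeaune_slope(x, y, a):
    """Right-hand side of dy/dx = a / (y - x), the assumed modern form."""
    return a / (y - x)


def follow_curve(a=1.0, x_end=20.0, h=1e-3):
    """Trace the branch through the vertex A with fourth-order Runge-Kutta steps.

    The slope is infinite at A itself, so the march starts a little way
    along the curve, using the local approximation y ~ x + sqrt(2*a*x).
    """
    x = 1e-4
    y = x + math.sqrt(2.0 * a * x)
    steps = int(round((x_end - x) / h))
    for _ in range(steps):
        k1 = debeaune_slope(x, y, a)
        k2 = debeaune_slope(x + h / 2, y + h / 2 * k1, a)
        k3 = debeaune_slope(x + h / 2, y + h / 2 * k2, a)
        k4 = debeaune_slope(x + h, y + h * k3, a)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return x, y


if __name__ == "__main__":
    a = 1.0
    x, y = follow_curve(a)
    # Gap between the curve and the 45-degree line y = x + a.
    print(x, y, y - (x + a))
```

Under these assumptions the gap shrinks towards zero as x grows, which is consistent with a line at 45° to the axis being an asymptote.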

It has proved characteristic of inverse tangent problems (differential
equations, as they came to be called) that they can be easy to
state but very difficult to solve. One merit of the calculus was to be that 
it not only provided a way of stating inverse tangent problems but it 
also provided a set of rules for manipulating the problem symbolically 
until it could (quite often) be solved, at least in the sense that the 
solution curve could be described via equations or formulas. 


1.2.2 Other Inverse Tangent Problems 


Inverse tangent problems often arose naturally in the contemporary 
study of physical and astronomical problems. 


Fig. 1.2 The tractrix


For example, in the 1670s, Claude Perrault, who is best remembered
as the architect of the east wing of the Louvre Palace in Paris, asked for
the curve traced by a heavy weight drawn behind someone walking
along a straight line. The solution is a curve called the tractrix (see
Fig. 1.2), which had previously been considered by both Newton and
Leibniz, although they did not identify it as a solution to this problem,
and later by Huygens. More formally, it is the curve with the property
that the length of the tangent from a point on the curve to a fixed line is
a constant.
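
The defining property can be checked numerically. The sketch below assumes the standard modern parametrisation x = a(t - tanh t), y = a / cosh t, with the fixed line taken as the x-axis; neither the parametrisation nor the function names come from the source. It estimates the tangent direction by central differences and measures the tangent segment from the curve to the axis.

```python
import math


def tractrix_point(t, a=1.0):
    """Point on the tractrix in the assumed parametrisation."""
    return a * (t - math.tanh(t)), a / math.cosh(t)


def tangent_length(t, a=1.0, eps=1e-6):
    """Length of the tangent segment from the curve point to the x-axis."""
    x0, y0 = tractrix_point(t, a)
    x_plus, y_plus = tractrix_point(t + eps, a)
    x_minus, y_minus = tractrix_point(t - eps, a)
    # Central-difference estimate of the tangent direction (dx, dy).
    dx = (x_plus - x_minus) / (2 * eps)
    dy = (y_plus - y_minus) / (2 * eps)
    # Travelling along the tangent, the axis y = 0 is reached after
    # s = -y0 / dy multiples of the direction vector.
    s = -y0 / dy
    return abs(s) * math.hypot(dx, dy)


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 3.0):
        print(t, tangent_length(t, a=2.5))
```

Every printed length should agree with the chosen a up to the small error of the finite-difference estimate. The value t = 0 is avoided because the curve has a cusp there.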

In his Principia Mathematica [206], Newton investigated the paths 
of particles that moved subject to forces directed at a central point. 
Later, the paths of particles moving under gravity and encountering 
various forms of air resistance were studied by Newton and others; 
these too arose as the answers to inverse tangent problems. In these 
cases, it is the instantaneous direction of acceleration that is known,
not the instantaneous velocity, so the problem is not strictly an inverse 
tangent problem but a generalisation. 

In 1696, Johann Bernoulli challenged the mathematical community 
to find the curve along which a sliding bead would descend most 
quickly between two given points—as we shall see in more detail in 
Chap. 2, the answer is a cycloid with a vertical tangent at the starting
point (a cycloid is the curve traced by a point on the rim of a wheel 
rolling along a straight line). Five mathematicians responded, among 
them Newton, who stayed up all night and answered the question the 
day he received it; Bernoulli said he recognised the solution as 
Newton’s as one recognises the lion by his paw (it had taken Bernoulli two
weeks).
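
A small numerical comparison illustrates the claim for one pair of points. It assumes a frictionless bead released from rest under uniform gravity, and it compares only two candidate paths, the cycloid and the straight chord, so it illustrates rather than proves the minimising property. Each path is approximated by a polyline, and the function names and the value of g are assumptions of this sketch.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (an assumed value)


def descent_time(points, g=G):
    """Time for a bead released from rest at points[0] to slide along a polyline.

    Each point is (x, depth), with depth measured downwards from the start.
    On a straight frictionless segment the acceleration is constant, so the
    time taken is the segment length divided by the mean of the end speeds.
    """
    total = 0.0
    for (x1, d1), (x2, d2) in zip(points, points[1:]):
        v1 = math.sqrt(2 * g * d1)
        v2 = math.sqrt(2 * g * d2)
        total += math.hypot(x2 - x1, d2 - d1) / ((v1 + v2) / 2)
    return total


def cycloid(r, n=2000):
    """Polyline along half an arch of the cycloid traced by a wheel of radius r."""
    thetas = [math.pi * k / n for k in range(n + 1)]
    return [(r * (t - math.sin(t)), r * (1 - math.cos(t))) for t in thetas]


def straight_line(r):
    """The chord joining the same two end points."""
    return [(0.0, 0.0), (math.pi * r, 2 * r)]


if __name__ == "__main__":
    r = 1.0
    print("cycloid:", descent_time(cycloid(r)))
    print("straight line:", descent_time(straight_line(r)))
```

For these end points, the cusp and the lowest point of the arch, the cycloid time should come out close to pi * sqrt(r/g), and noticeably shorter than the time along the chord.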

As these dates indicate, these problems were very difficult, and the 
calculus enabled some progress to be made on a broad front. 
Unsurprisingly, it was not always quite as simple as that. 


1.3 From Inverse Tangent Problems to Differential Equations


In the 1690s and early 1700s, Johann Bernoulli became the leading 
exponent of the calculus, which he and his older brother had learned by 
corresponding with Leibniz and reading his published papers. He went 
to Paris in 1691 and got himself hired to teach the Marquis de l’Hôpital
the new calculus, and as a result, the first book on the differential
calculus appears with de l’Hôpital as the author.

From the book, we can see that Bernoulli’s definition of integration 
is interesting, for he defined it as Newton had done as the inverse of 
differentiation and not, as Leibniz did, as an infinite sum, and he gave 
several methods for finding areas. Then Bernoulli turned to inverse 
tangent problems and solved a variety of examples. The concluding, and 
arguably most important, part of the book was an exposition of how 
problems in geometry or mechanics can be translated into the language 
of calculus. Calculus was still very new, and showing how to express 
problems using it could be the hardest part of a mathematician’s work. 

In summary, the translation procedure would go as follows.


(1) Set up a system of x and y coordinates with respect to which the
solution to the problem can be expressed as a curve, and then
formulate the problem in terms of equations involving these
coordinate variables.


(2) Interpret the problem as a statement about the relationship
between neighbouring points on the curve; this will take the form
of an equation involving differentials.


(3) Pass back from the differentials to the finite quantities (x and y)
and so determine the precise form of the equation that describes
the solution curve.


This method was not invented by Bernoulli. Leibniz had already 
tackled Debeaune’s problem this way, however clumsily—Leibniz was 
prone to careless errors. Bernoulli raised his approach to the level of a 
systematic method for tackling many such problems, and this moment 
of transition is marked by a change of name. Henceforth, inverse 
tangent problems came to be called differential equations, because in step
(2) they were literally expressed as equations involving differentials. 
That the new name stuck shows how closely the new methods of the 
calculus became associated with problems involving instantaneous 
change or changes from point to point along a curve.
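
As an illustration of the three steps, here is how the tractrix of Sect. 1.2.2 might be handled in modern notation. This is a sketch under stated assumptions (the fixed line is the x-axis, a is the constant tangent length, and the curve starts at height a above the axis); it is not a reconstruction of any historical argument.

```latex
% Step (1): coordinates. Take the fixed line as the x-axis and let (x, y)
% be a point of the curve, with a the constant length of the tangent segment.
%
% Step (2): differentials. The tangent segment is the hypotenuse of a right
% triangle with vertical leg y and horizontal leg \sqrt{a^2 - y^2}, so
\[
  \frac{dy}{dx} = -\frac{y}{\sqrt{a^{2} - y^{2}}}.
\]
% Step (3): back to finite quantities. Separating the variables and
% integrating, with y = a at x = 0, gives
\[
  x = a \ln\!\left( \frac{a + \sqrt{a^{2} - y^{2}}}{y} \right)
      - \sqrt{a^{2} - y^{2}}.
\]
```

Differentiating the last formula with respect to y returns dx/dy = -\sqrt{a^2 - y^2}/y, which agrees with the equation obtained in step (2).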

None of this would be of any use if the resulting differential 
equation could not be solved. Now, to Bernoulli and his 


contemporaries, a solution ideally meant a geometrical description of 
the required curve. This is a global description of a curve, such as is 
usually given for a circle, a conic section, or a few other curves such as 
the cycloid. The calculus, however, did not always lend itself to 
providing such a thing. When stage (3) has been carried out 
successfully, the solution is expressed as an equation in coordinates x 
and y that defines a curve (depending on some initial conditions). But 
to mathematicians of the late seventeenth century, a further step was 
required in which the curve was characterised by some property by 
which it could be recognised, much as we today routinely gloss a curve 
defined by a quadratic equation as a particular sort of conic section. 
This amounts to reversing stage (1) and is much harder than traversing 
it, and since that was often hard enough, going backwards often proved 
to be too difficult. Indeed, why should the solution curve be any kind of 
known curve? However, if this step is not taken, the curve can at best be 
drawn pointwise, and important properties of it might remain 
undetected.⁶ Gradually, mathematicians began to accept equations as 
the solution and not to look beyond them; and the more they did so, the 
more formal and algebraic, and the less geometrical, mathematics 
became. 

The catenary problem is a good example. This asks for the shape of 
a heavy chain hanging freely from its ends, and so it is of obvious 
interest to bridge builders. Galileo had suggested in 1638 it would 
“assume the form of a parabola”, but in 1646, Huygens showed that this 
is incorrect.⁷ The problem was first solved by Leibniz and Johann Bernoulli 
independently in 1691, and then raised again by Jakob Bernoulli in the 
Acta Eruditorum in 1701 as a challenge to the mathematical 
community.⁸ This invites the question of what Jakob Bernoulli thought 
he was doing asking a question that had been dealt with successfully a 
decade before, and Paolo Fregulia, whose account is our guide here, 
speculates that his aim was “to put this solution in a more theoretical 
general context”, namely, isoperimetrical problems.⁹ 

Johann Bernoulli's solution, written in 1701 but published only in 
1706, proceeded by first expressing the force on an infinitesimal piece, 
AB, of the chain in terms of the weight of the string hanging below A, 
then by calculating the force at the neighbouring point B, and then by 
arguing that since AB does not move, the effect of the forces at A and B 


must cancel out. So the raw ingredients are the length, s, of the chain 
from the far end E to the point A—which is proportional to its weight— 
and the differentials dx and dy which come in because the string is 
curved and so the forces at A and B do not point in quite the same 
direction. The differential equation Bernoulli obtained is 


\[ \frac{dy}{dx} = \frac{s}{a}, \]
where a is a constant. Stage (ii) was completed when the variable s, 
which depends on the values of x and y, was eliminated in favour of an 


explicit expression involving x and y. We omit the details of how this 
was done, and pass straight to the resulting equation: 


\[ dy = \frac{a\,dx}{\sqrt{x^2 + 2ax}}. \]


This completes stage (ii). To carry out stage (iii), Bernoulli noticed, as 
Newton had done much earlier, that the simplest kind of differential 
equation one could hope to get was of the form 


\[ (\text{something in } x)\,dx = (\text{something in } y)\,dy \]


because you could then hope to integrate both sides. The method of 
trying to arrange for this to happen he called the method of separation 
of variables. When the variables do not separate, Bernoulli enriched the 
method by suggesting that one looks for new variables with respect to 
which the differential equation does separate. This meant setting aside 
all qualms about the nature of differentials and manipulating them 
formally just as one does finite quantities in elementary algebra. 
Bernoulli’s techniques are simple algebraic devices; his insight lay in 
seeing that such methods could be applied in their new setting. In so doing, 
he was following the lead of Newton and Leibniz. 
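
As a numerical check on the separated catenary equation $dy = a\,dx/\sqrt{x^2 + 2ax}$ quoted above, the following sketch integrates the right-hand side and compares the result with the closed form $y = a\,\operatorname{arcosh}((x + a)/a)$, which is the same as $x + a = a\cosh(y/a)$. The change of variable $x = u^2$, the use of Simpson's rule, the function names, and the sample values of $a$ and $x$ are all assumptions made here for illustration in modern notation; none of them is taken from Bernoulli.

```python
import math

def catenary_y_closed(x, a):
    """Closed form of the integral of a dx / sqrt(x^2 + 2ax) from 0 to x."""
    return a * math.acosh((x + a) / a)

def catenary_y_numeric(x, a, steps=2000):
    """Integrate a dx / sqrt(x^2 + 2ax) from 0 to x by Simpson's rule.

    The change of variable x = u^2 (an assumption made here for convenience)
    turns the integrand into 2a / sqrt(u^2 + 2a), which is smooth at u = 0.
    """
    upper = math.sqrt(x)
    h = upper / steps  # steps must be even for Simpson's rule
    g = lambda u: 2 * a / math.sqrt(u * u + 2 * a)
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

a, x = 1.5, 2.0  # hypothetical sample values
y_numeric = catenary_y_numeric(x, a)
y_closed = catenary_y_closed(x, a)
print(y_numeric, y_closed)
```

The two printed values should agree closely, which illustrates how a change of variable in Bernoulli's spirit turns an awkward integral into a tractable one.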

These ideas are well illustrated by Bernoulli's discussion of 
Debeaune’s problem, which he gave in one of a series of lessons to the 
Marquis de l’Hôpital in 1691.¹⁰ 


Fig. 1.3 Bernoulli's formulation of Debeaune’s problem 


Another such example is the problem set to M. Descartes by M. 
Debeaune, the solution to which is not in his works but can be 
found in his Letters (vol. III, No. 71). The solution of it does not 
appear to be very easy according to our method, indeed at first 
sight the problem appears impossible by this method. But we 
shall see that by a change of variables it becomes easy to 
separate them, and that this problem can be solved completely 
once the quadrature of the hyperbola is given, for the curve is 
mechanical (Fig. 1.3). 

The problem goes like this: a line AC makes an angle of half a 
right angle with the axis AD, and E is a given constant line 
segment; what is the nature of the curve AB in which the 
ordinates BD are to the subtangents FD as the given E is to BC? 

Solution. Let AD = x, DB = y, E = a, suppose by hypothesis 
that $dy : dx = a : (y - x)$, then $a\,dx = y\,dy - x\,dy$. From this 
equation the nature of the curve is to be found, either by 
integration or by rewriting y with dy on one side and x with dx 
on the other, for then two areas can be found and by comparing 
them the nature of the curve can be found. But the equation just 
found cannot be integrated, nor can x and dx be separated from y 
and dy; however, it can be changed into another by substituting 
the value of another variable. Therefore let $y - x = z$ 
and $dy = dz + dx$. The equation just found transforms into this: 
$a\,dx = z\,dz + z\,dx$, or $a\,dx - z\,dx = z\,dz$, and $dx = z\,dz : (a - z)$. 
Therefore these two variables separate, and we are led to the 
curve on multiplying by a, $a\,dx = az\,dz : (a - z)$. 

Corollary I. The curve AB has its asymptote parallel to AC. 
Corollary II. The space [i.e. area] $ADB = xy + ax - \tfrac{1}{2}yy$. 

We see that Bernoulli first stated the problem, then he introduced 
coordinates (stage (i)) and then differentials (stage (ii)). Aware in 
advance that the method of separation of variables does not apply to 
the differential equation $a\,dx = y\,dy - x\,dy$, he introduced the change of 
variable $y - x = z$, which implies $dy = dz + dx$, thereby extending a 
method used for finite quantities to differentials. Now he had an 
equation in the variables x and z in which the variables do separate, 

\[ dx = \frac{z}{a - z}\,dz. \]


As was thought necessary at the time, he did not accept the solution to 
the differential equation that results from integrating both sides, 


\[ x = -z - a\ln(a - z), \]


but went on to give a geometric interpretation of the result. Helpful 
though that can be, the tradition was to lapse in the face of too many 
cases where it could not profitably be done. 
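
The integrated form can be checked against Debeaune's original condition numerically. The sketch below is a modern illustration, not Bernoulli's procedure: it parametrises the curve by $z = y - x$, adds the constant $a\ln a$ (an assumption made here so that the curve passes through the origin), and estimates the slope by central differences. The function names and sample values are hypothetical.

```python
import math

def debeaune_point(z, a):
    """Point (x, y) on the solution curve, parametrised by z = y - x.

    Uses x = -z - a*ln(a - z) + a*ln(a); the constant a*ln(a) is an assumption
    chosen here so that the curve passes through the origin (z = 0).
    """
    x = -z - a * math.log((a - z) / a)
    return x, x + z

def slope(z, a, h=1e-6):
    """Estimate dy/dx on the curve by central differences in the parameter z."""
    x_lo, y_lo = debeaune_point(z - h, a)
    x_hi, y_hi = debeaune_point(z + h, a)
    return (y_hi - y_lo) / (x_hi - x_lo)

a = 1.0  # hypothetical value of the given segment E
for z in (0.25, 0.5, 0.9):
    x, y = debeaune_point(z, a)
    # Debeaune's condition dy : dx = a : (y - x)
    print(z, slope(z, a), a / (y - x))
```

In each printed row the estimated slope should match $a/(y - x)$. As $z$ approaches $a$ the value of $x$ grows without bound while $y - x$ stays below $a$, which is consistent with Corollary I.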

However, if we look at some ideas that Bernoulli wrote down only a 
few years later, in 1702, we can see the growth of formalism.¹¹ By now, 
he was quite clear that expressions such as $\frac{dx}{x + f}$, where $f$ is a constant, 
are differentials of logarithms, and therefore that 

\[ \int \frac{dx}{x + f} \]

is a logarithm. He now introduced a variety of changes of variables to 
deduce that certain previously encountered integrals can be expressed 
in terms of either logarithms or circular arcs. His list includes the 
integral 

\[ \int \frac{dx}{\sqrt{x^2 + 2ax}} \]

which arose in the catenary problem, and the logarithm lurking in 
Debeaune’s problem. Bernoulli regarded his changes of variable as 
enabling him to pass from circular arcs to arcs of hyperbolas and back, 
which made his geometric interpretations of analytic formulas more 
flexible. Interestingly, they involved him in introducing complex 
numbers, which was to occasion some confusion later on. 


1.4 Differential Equations 


We turn now to study how Euler rewrote the calculus. 

The Leibnizian form of the calculus, which was the form adopted by 
the mathematicians of continental Europe, was initially seen as a set of 
algorithms for handling problems about curves. These algorithms work 
because they apply to formal expressions involving variables, and the 
two basic operations, differentiation and integration, $d$ and $\int$, obey 
rules such as 

\[ d(uv) = u\,dv + v\,du \quad\text{and}\quad d\int u = u. \]

The connection between these formal operations and geometry arises 
from their geometrical interpretations; for example, $d$ has to do with 
finding tangents, and $\int$ with areas. 


Euler rewrote the calculus by regarding it as being about formal 
expressions and replacing the concept of a curve with that of a function. 


Calculus is about expressions that can be differentiated and integrated. 
It is only about curves in so far as they can be described by formal 
expressions, which allow one to use differentiation to find tangents and 
so forth, and the solution to a differential equation is not to be 
expressed as a curve but as an explicit or implicit function of the 
coordinates. 

As is well known, Euler’s analysis of sine, cosine, and the 
exponential functions is unified by the function idea. He expressed 
them as power series and treated them formally or algebraically; there 
is very little geometry. In particular, his controversial solution to the 
problem of defining the logarithm of a negative number proceeded by 
defining log as the inverse function to exp.¹² 


Euler’s emphasis on the calculus, and indeed much of mathematics, 
as a science of formal expressions widely restructured mathematical 
theory. His treatment of Debeaune’s problem provides another 
example. Euler went from the differential equation 

\[ dx = \frac{z\,dz}{a - z} \]

to an answer in this form: 

\[ x + z + a\log(a - z) = \text{constant}. \]


He saw this equation as the answer, and saw no need for a geometrical 
interpretation. Indeed, in his definitive account of the integral calculus 
(published between 1768 and 1770) he did not even mention 
Debeaune’s equation by name when he gave a complete account of how 
to solve all differential equations of the form 


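\[ (\alpha + \beta x + \gamma z)\,dx = (\delta + \varepsilon x + \zeta z)\,dz \]

(this equation reduces to Debeaune’s on setting 
$\alpha = a$, $\beta = 0 = \delta = \varepsilon$, $\gamma = -1$, $\zeta = 1$). 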


We shall see that Euler’s mathematics is full of investigations of 
objects defined by differential equations or integrals. Problems are 
expressed as equations (finite or differential) and solved by finding 


power series expansions or other algebraic reformulations. If one sees 
mathematics as having three aspects (problems, methods, and results), 
then one might say that Euler very often saw problems algebraically 
and solved them algebraically, expressing his results either in finite 
terms or as infinite series.¹³ 


1.5 Linear Ordinary Differential Equations 


It had been known from the time of Newton and Leibniz that the 
calculus provides good answers and not-so-good answers. A three-way 
correspondence between Euler, Johann Bernoulli, and Johann’s son 
Daniel provides a good illustration of how this difficulty was addressed 
in the context of finding the shapes of a vibrating, clamped rod (setting 
aside the question of how the shape varies in time). 

Daniel raised this problem in a letter to Euler of 18 December 1734. He wrote 
to Euler again on 4 May 1735 to say that he had found a differential 
equation that describes its shape, but that the only solutions he could 
find to the equation, which involved sines and exponentials, seemed 
inappropriate. Euler replied with a solution in the form of a power 
series, which he wrote up and presented as a paper (E40) [71] to the St. 
Petersburg Academy of Sciences. This is not a good answer. As so often, 
the power series is unilluminating, and it concealed from Euler the fact 
that the rod can vibrate in several distinct ways. 

Then, in 1739, Euler spotted a much better approach and found a 
much better answer, which he described in a letter that he wrote to 
Johann Bernoulli on 15 September,¹⁴ and more fully in a paper (E62) 
published in 1743. 

In this letter, he proposed a simple, general method for all 
differential equations of a form he described, which we could call linear 
ordinary differential equations of arbitrary order and with constant 
coefficients. His method reduces these problems to the solution of a 
polynomial equation and establishes that the answer to the differential 
equation is always given as a sum of exponentials, sines and cosines. He 
wrote: 


I have recently found a remarkable way of integrating 
differential equations of higher degrees in one step, as soon as a 
finite [algebraic] equation has been obtained. Moreover this 
method extends to all equations which, on setting dx constant, 
are contained in this general form: 

\[ y + \frac{a\,dy}{dx} + \frac{b\,ddy}{dx^2} + \frac{c\,d^3y}{dx^3} + \frac{d\,d^4y}{dx^4} + \frac{e\,d^5y}{dx^5} + \text{etc.} = 0. \]

To find the integral of this equation I consider this equation or 
algebraic expression: 

\[ 1 - ap + bp^2 - cp^3 + dp^4 - ep^5 + \text{etc.} = 0. \]

If possible this expression is resolved into simple real factors of 
the form $1 - \alpha p$: if, however, this cannot be done resolve it into 
factors of two dimensions of this form $1 - \alpha p + \beta pp$, which 
resolution can always be done in reals, for whatever form the 
equation may have it can always be put in the form of a product 
of factors either simple, $1 - \alpha p$, or of two dimensions 
$1 - \alpha p + \beta pp$, all real. This resolution being done, I say that the 
value of y is a finite expression in x and constants, obtained from 
all the members which have been factors of the algebraic 
expressions, and singular members supply singular terms of the 
integral. Certainly the simple factor $1 - \alpha p$ gives as member of 
the integral $Ce^{-x/\alpha}$, and a composite factor $1 - \alpha p + \beta pp$ gives 
this member of the integral 

\[ e^{-\alpha x/2\beta}\left( C \sin A.\,\frac{x\sqrt{4\beta - \alpha\alpha}}{2\beta} + D \cos A.\,\frac{x\sqrt{4\beta - \alpha\alpha}}{2\beta} \right) \]

where for me sin A. and cos A. denote the sine and the cosine of 
arcs in a circle of radius = 1: however it is to be noticed that if 
the expression $1 - \alpha p + \beta pp$ cannot be resolved into simple 
real factors, when $4\beta > \alpha\alpha$, still the integrals are real. 
Let the following be taken as a suitable example 

\[ y\,dx^4 = K^4\,d^4y, \quad\text{or}\quad y - \frac{K^4\,d^4y}{dx^4} = 0: \]

this gives rise to the algebraic expression $1 - K^4p^4$ whose real 
factors are these three $1 - Kp$, $1 + Kp$, $1 + K^2p^2$; and from 
these spring the integrals of the equation 

\[ y = Ce^{-x/K} + De^{x/K} + E \sin A.\,\frac{x}{K} + F \cos A.\,\frac{x}{K}: \]

in which expression, because a four-fold integration has been 
done in one operation, there are four new constants as the 
nature of the integration demands. If it would please you, most 
excellent sir, I shall write down the method of proof on another 
occasion. 


It is not clear how Euler came upon his brilliant idea. It falls out, 
however, as soon as one tries to see if the differential equation is solved 
by functions of the form $y = e^{-px}$. Because 

\[ \frac{dy}{dx} = -py, \quad \frac{d^2y}{dx^2} = p^2y, \]

and so on, when $y = e^{-px}$ is substituted into the equation, the resulting 
equation is 

\[ e^{-px}\left(1 - ap + bp^2 - cp^3 + dp^4 - ep^5 + \text{etc.}\right) = 0. \]

The expression $e^{-px}$ is never zero, so it can be divided out, and 
therefore, as Euler claimed, $y = e^{-px}$ is a solution of the differential 
equation if p is a solution of the polynomial equation. 
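To find the values of p, Euler claimed that the polynomial equation 
can always be factored into linear terms of the form $1 - \alpha p$ and 
quadratic terms of the form $1 - \alpha p + \beta pp$. This is an example of what 
came to be called the fundamental theorem of algebra, which was 
widely believed at the time, but not proved. Euler then solved these 
equations for p and found $p = 1/\alpha$ and $p = \frac{\alpha \pm \sqrt{\alpha^2 - 4\beta}}{2\beta}$, respectively. 
Notice that the second expression is equal to $\frac{\alpha \pm i\sqrt{4\beta - \alpha^2}}{2\beta}$. 

The first case leads to the solution $y = e^{-x/\alpha}$. The second case leads 
to the solution 

\[ y = \exp\left(-\frac{\alpha x}{2\beta} \mp i\,\frac{x\sqrt{4\beta - \alpha^2}}{2\beta}\right). \]

Now he already knew that 

\[ e^{p + iq} = e^{p}(\cos q + i\sin q), \]

so he saw that 

\[ \exp\left(-\frac{\alpha x}{2\beta} \mp i\,\frac{x\sqrt{4\beta - \alpha^2}}{2\beta}\right) = \exp\left(-\frac{\alpha x}{2\beta}\right)\left[\cos\frac{x\sqrt{4\beta - \alpha^2}}{2\beta} \mp i\sin\frac{x\sqrt{4\beta - \alpha^2}}{2\beta}\right]. \]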


The term involving the sine function remains a solution when 
multiplied by any constant, and so the factor of i can be removed by 
multiplying by i, and Euler’s solutions are finally obtained. 
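
The claim about quadratic factors can be tested numerically. The sketch below assumes, following the correspondence between Euler's general equation and his algebraic expression, that the factor $1 - \alpha p + \beta pp$ belongs to the equation $y + \alpha\,dy/dx + \beta\,d^2y/dx^2 = 0$, and checks by finite differences that the real sine and cosine member satisfies it. The function names and the values of $\alpha$ and $\beta$ are hypothetical.

```python
import math

def quadratic_member(x, alpha, beta, C=1.0, D=1.0):
    """Real solution attached to the factor 1 - alpha*p + beta*p^2 (4*beta > alpha^2)."""
    omega = math.sqrt(4 * beta - alpha * alpha) / (2 * beta)
    decay = math.exp(-alpha * x / (2 * beta))
    return decay * (C * math.sin(omega * x) + D * math.cos(omega * x))

def residual(x, alpha, beta, h=1e-4):
    """Evaluate y + alpha*y' + beta*y'' at x using central differences."""
    y = lambda t: quadratic_member(t, alpha, beta)
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return y(x) + alpha * d1 + beta * d2

alpha, beta = 1.0, 2.0  # hypothetical values with 4*beta > alpha^2
residuals = [residual(x, alpha, beta) for x in (0.0, 0.5, 1.3)]
print(residuals)
```

The residuals should be zero up to the error of the finite-difference approximation, whatever real constants C and D are chosen.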

As we remarked, Daniel Bernoulli had already noticed that 
exponentials, sines, and cosines were among the solutions, but 
Euler was the first to see that every solution could be written in terms 
of them. As a result, his new solution to the differential equation for the 
vibrating rod is much better than his earlier power series solution, 
because it becomes possible to see what some of the solutions actually 


look like, and this is very instructive. For example, each of the next four 
functions is separately a solution: $y = e^{-x/K}$, $y = e^{x/K}$, $y = \sin(x/K)$, 
and $y = \cos(x/K)$, which can be called the basic modes of vibration of 


the rod. Moreover, the new approach brings to light that the shape of 
the rod at any instant is a certain sum of these basic modes with 
constant coefficients. 

Furthermore, because the rod is fastened to the wall and protrudes, 
say, horizontally, and the mortar is secure and immovable, any solution 
is subject to the two initial conditions that when $x = 0$ necessarily 
$y = 0$ and $\frac{dy}{dx} = 0$. This eliminates some combinations of the basic 
modes, and the allowed solutions (at any moment of time) are all of the 
form 

\[ \alpha e^{Kx} + \beta e^{-Kx} - (\alpha + \beta)\cos Kx - (\alpha - \beta)\sin Kx. \]


This also explained what Daniel Bernoulli had noted experimentally: a 
thin rod clamped to a wall can be made to emit several different sounds 
as it is plucked, and indeed, several different sounds at once because it 
can be in several distinct shapes.¹⁵ 
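
A short numerical check of the clamped-rod formula may be helpful. The sketch below takes the displayed combination at face value, in its $Kx$ convention (so that the governing equation is assumed to read $d^4y/dx^4 = K^4y$), and confirms that it vanishes together with its first derivative at $x = 0$. The finite-difference helpers, the names, and the numerical values are assumptions made for illustration.

```python
import math

def rod_shape(x, K, alpha, beta):
    """The combination of basic modes allowed by the clamping conditions."""
    return (alpha * math.exp(K * x) + beta * math.exp(-K * x)
            - (alpha + beta) * math.cos(K * x)
            - (alpha - beta) * math.sin(K * x))

def first_derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def fourth_derivative(f, x, h=1e-2):
    """Five-point central-difference estimate of the fourth derivative of f at x."""
    return (f(x - 2 * h) - 4 * f(x - h) + 6 * f(x)
            - 4 * f(x + h) + f(x + 2 * h)) / h ** 4

K, alpha, beta = 1.2, 0.7, -0.3  # hypothetical values
y = lambda x: rod_shape(x, K, alpha, beta)
print(y(0.0), first_derivative(y, 0.0))          # both conditions at the wall
print(fourth_derivative(y, 0.8), K ** 4 * y(0.8))  # the two sides of y'''' = K^4 y
```

Any choice of $\alpha$ and $\beta$ passes the same checks, which reflects the fact that two of the four arbitrary constants survive the two conditions at the wall.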

Euler informed his 72-year-old former professor Johann Bernoulli 
of his claims about this class of differential equations but did not send 
him a proof. Not to be outdone, Bernoulli replied with a proof in early 
December 1739, but his approach is interestingly old-fashioned. He 
first showed how to reduce the differential equation to a polynomial 
equation, much as Euler had done. However, Bernoulli adhered to the 
geometric language that Euler’s work would gradually drive out, and 
interpreted the solutions, which he wrote in the form $y = n^{x/p}$, as 
“logarithmic curves whose subtangent is to be found”.¹⁶ In addition, 
Bernoulli regarded the equation $p^4 - K^4 = 0$ as the same as $p = K$ 


and remarked that whereas he had one solution, Euler had exhibited 
several. He commented that for this to be the case “my logarithms will 
be impossible or imaginary, but it is also the same in your solution, 


allowed to be more general, for you must let K be impossible or non- 
real”. This shows that complex numbers were puzzling when they 
occurred in problems involving real quantities, but were nonetheless 
accepted, perhaps as something that needed to be better understood. 

Euler’s attitude to what he would consider an answer to a 
differential equation makes one crucial advance over Johann 
Bernoulli's. The methods they used to solve differential equations were 
not so very different: changes of variable, cunning substitutions, and so 
on, although certainly Euler’s insight that reduces these differential 
equations to polynomial equations was a breakthrough. But changing 
what was considered an acceptable answer was an essential ingredient 
in advancing the calculus—and Euler seems to have had some success 
in convincing Johann Bernoulli, too, of the value of thinking in this way. 

The first two volumes of Euler’s Institutionum Calculi Integralis 
(E342 and E366) give a good indication of what he could do by the late 
1760s (the book was presented to the St. Petersburg Academy in 
August 1766; volume 1 was published in 1768, volume 2 in 1769). 
Euler investigated a number of different kinds of ordinary differential 
equations, looking for simplifications and for general methods. Quite an 
amount of insight into complete solutions, particular integrals, and 
initial conditions is accumulated. In volume 2, Chap. 8, Euler considered 
the second-order ordinary differential equation, and began by solving 
the linear equation by the method of undetermined coefficients. (A 
particular type of this equation is the hypergeometric equation, which 
we shall investigate in Chap. 11.) 

Given the equation 

\[ \frac{d^2y}{dx^2} + M\frac{dy}{dx} + Ny = X, \]

in which M, N, and X are functions of x, Euler began, as usual, by taking 
particular cases. His first example was 

\[ x^2(a + bx^n)\frac{d^2y}{dx^2} + x(c + dx^n)\frac{dy}{dx} + (f + gx^n)y = 0. \]

He looked for a solution of the form 

\[ y = x^{\lambda}(A + Bx^n + Cx^{2n} + \cdots), \]

and by looking at the lowest power of x was led to the equation 

\[ \lambda(\lambda - 1)a + \lambda c + f = 0. \]

Once this is solved, the constants B, C, ... can all be found recursively 
in terms of A. 
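
The recursion described above can be sketched in a few lines. The sketch assumes that the trial series advances in steps of n, that is $y = x^{\lambda}(A_0 + A_1x^n + A_2x^{2n} + \cdots)$ with $A_0$ normalised to 1; the function names are hypothetical, and the Bessel-type test case is a modern illustration rather than one of Euler's own examples.

```python
from fractions import Fraction

def indicial(m, a, c, f):
    """The expression m(m - 1)a + mc + f, which vanishes at m = lambda."""
    return a * m * (m - 1) + c * m + f

def series_coefficients(lam, n, a, b, c, d, f, g, terms):
    """Coefficients A_0, A_1, ... of y = x^lam * sum(A_k * x^(k*n)).

    Matching the power x^(lam + k*n) in the differential equation gives
    A_k * indicial(lam + k*n, a, c, f) + A_(k-1) * indicial(lam + (k-1)*n, b, d, g) = 0.
    A_0 is normalised to 1, and lam is assumed to be a root of the indicial
    equation for which no later denominator vanishes.
    """
    coeffs = [Fraction(1)]
    for k in range(1, terms):
        numer = indicial(lam + (k - 1) * n, b, d, g)
        denom = indicial(lam + k * n, a, c, f)
        coeffs.append(-coeffs[-1] * numer / denom)
    return coeffs

# Hypothetical test case: x^2 y'' + x y' + x^2 y = 0, i.e. a = 1, c = 1, g = 1, n = 2,
# for which lam = 0 satisfies the indicial equation.
bessel_like = series_coefficients(lam=0, n=2, a=1, b=0, c=1, d=0, f=0, g=1, terms=4)
print(bessel_like)
```

When two roots of the indicial equation differ by a multiple of n, one of the denominators vanishes and the recursion breaks down, which is the difficult case that calls for a logarithmic term.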
Euler also looked for solutions in which the powers of x decrease, at 
the cases of both real and imaginary values of $\lambda$, and at the more difficult 
case when the values of $\lambda$ are either the same or differ by an integer 
and the method provides only one solution, not two. In this case, 
Euler found another solution with a logarithmic term. 

It seems that linear differential equations of higher order were 
beyond Euler’s reach not because of the method of series but because 
of problems with the correspondingly higher order equation for $\lambda$. 


There were, of course, no explicit methods for solving the quintic 
equation, nor was there a secure proof of the so-called fundamental 
theorem of algebra. So although Volume 2, Sect. 2, Chap. 2 of Euler’s 
book covers the third-order linear equation with constant coefficients, 
the solutions of which are of the form $e^{\lambda x}$ for the values of $\lambda$ that 
satisfy the corresponding cubic equation, and Euler dealt with both the 
case of distinct roots and the case of repeated roots, the extension of 
the analysis to higher order equations became lost in the details. 

It had not been a hundred years since Leibniz had struggled to 
master one inverse tangent problem, but by the 1760s, Euler had a 
theory of many different kinds of differential equations. It embraced 
differentials of various degrees; homogeneous equations; solution 
methods that introduced multipliers or employed the method of 
infinite series or relied on a method of successive approximation. The 
formal side of the calculus was evidently being deployed, so much so 
that examples appear for the first time only on page 355—surely a good 
sign that we are in the presence of a theory rich enough to keep mere 
examples at bay. 


1.5.1 A Note on the Adjoint Equation 


I mention here that Euler’s work was extended by the young Joseph- 
Louis Lagrange in a 200-page memoir [171] that he published in 
Miscellanea Taurinensia. Lagrange established that a linear equation of 
order n of the form 

\[ Ly + M\frac{dy}{dt} + N\frac{d^2y}{dt^2} + \cdots = T, \]

where L, M, N, ..., T are functions of t, will have n solutions 


(independence is implied but not stated) as the method of 
undetermined coefficients would suggest, and proceeded to investigate 
interesting cases and methods for reducing the order of the equation. 
This led him to discover that one can associate to a given ordinary 
differential equation another that, if solved, enables one to reduce the 
order of the original ordinary differential equation by one. Repeating 
this trick on the new equation returns almost to the original one; it 
actually comes back as 


\[ Ly + M\frac{dy}{dt} + N\frac{d^2y}{dt^2} + \cdots = 0. \]
In the nineteenth century, the second equation came to be called the 
adjoint equation of the first equation, and so one could say that 
Lagrange had proved that the adjoint of an equation is (the 
homogeneous form of) the original equation. 
It will be enough to demonstrate the trick on the second-order 
equation 

\[ Ly + M\frac{dy}{dt} + N\frac{d^2y}{dt^2} = T. \]

Lagrange multiplied both sides by an unknown function z = z(t) and 
integrated, to get 

\[ \int Lyz\,dt + \int M\frac{dy}{dt}z\,dt + \int N\frac{d^2y}{dt^2}z\,dt = \int Tz\,dt. \]

He integrated the terms involving M and N by parts, so the equation 
becomes 

\[ \int Lyz\,dt + Myz - \int \frac{d(Mz)}{dt}y\,dt + Ny'z - \int \frac{d(Nz)}{dt}y'\,dt = \int Tz\,dt, \]

and, integrating by parts again, 

\[ \int Lyz\,dt + Myz - \int \frac{d(Mz)}{dt}y\,dt + Ny'z - y\frac{d(Nz)}{dt} + \int \frac{d^2(Nz)}{dt^2}y\,dt = \int Tz\,dt, \]

which he rearranged with respect to y, when it becomes 

\[ y\left(Mz - \frac{d(Nz)}{dt}\right) + \frac{dy}{dt}Nz + \int \left(Lz - \frac{d(Mz)}{dt} + \frac{d^2(Nz)}{dt^2}\right)y\,dt = \int Tz\,dt. \]

So, if z is chosen to be a solution of the equation 

\[ Lz - \frac{d(Mz)}{dt} + \frac{d^2(Nz)}{dt^2} = 0, \]

then y satisfies 

\[ y\left(Mz - \frac{d(Nz)}{dt}\right) + \frac{dy}{dt}Nz = \int Tz\,dt, \]

which is of degree one less than before. As Lagrange pointed out, the 
adjoint equation is simpler than the original one because it is 
homogeneous, and if it can be solved the original equation is also 
simpler because it has been reduced to one of lower degree. 
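
Lagrange's reduction can be illustrated on a small example. The sketch below assumes constant coefficients L = 2, M = 3, N = 1, for which $z = e^t$ solves the adjoint equation $Lz - d(Mz)/dt + d^2(Nz)/dt^2 = 0$, and checks numerically that the first-order expression $y(Mz - d(Nz)/dt) + (dy/dt)Nz$ has derivative $Tz$, as the integrated form requires. The coefficients, the test function $y = \sin t$, and all names are hypothetical choices made for illustration.

```python
import math

L, M, N = 2.0, 3.0, 1.0  # hypothetical constant coefficients

z = lambda t: math.exp(t)   # solves the adjoint equation L*z - M*z' + N*z'' = 0
dz = lambda t: math.exp(t)  # derivative of z

y = lambda t: math.sin(t)   # an arbitrary test function
dy = lambda t: math.cos(t)
d2y = lambda t: -math.sin(t)

def T(t):
    """Right-hand side generated by the test function: T = L*y + M*y' + N*y''."""
    return L * y(t) + M * dy(t) + N * d2y(t)

def bracket(t):
    """The first-order expression y*(M*z - d(N*z)/dt) + y'*N*z."""
    return y(t) * (M * z(t) - N * dz(t)) + dy(t) * N * z(t)

def bracket_rate(t, h=1e-6):
    """Central-difference estimate of the derivative of the bracket."""
    return (bracket(t + h) - bracket(t - h)) / (2 * h)

for t in (0.0, 0.4, 1.1):
    print(bracket_rate(t), z(t) * T(t))
```

The two columns should agree, so that integrating $Tz$ recovers the bracket and the second-order equation for y is replaced by a first-order one.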

Lagrange then turned to other questions: the motion of fluids, the 
vibrating string, the motion of the planets. All in all, it is a formidable 
paper, familiar with the work of d’Alembert on fluids and Euler on the 
vibrating string, although Lagrange sided more with d’Alembert than 
Euler over the generality of the solutions of the wave equation. 


1.6 Exercises 
1. Solve Debeaune’s problem and show that the solution with the 


stated initial conditions has the line x + y = 0 as an asymptote. 


2. Solve Euler’s equation 

\[ x^2(a + bx^n)\frac{d^2y}{dx^2} + x(c + dx^n)\frac{dy}{dx} + (f + gx^n)y = 0. \]

What qualitative features of the solution are apparent to you (if 
any)? 


Questions 


1. What qualitative features of a function are apparent to you from its 
power series representation? Consider, for example, the series for 
exp, cos, and sin. Is it at all obvious that the series for cos and sin 
define periodic functions? 


2. How would you attempt to graph the equation for the tractrix if you 
did not know of its origins as a problem in physics? In the light of 
your experience, how well do you think a mathematician of the late 
seventeenth century could claim to understand a curve knowing 
only its equation? 


Footnotes 


1 Descartes’ La Géométrie was published in 1637 as one of a number of appendices in his 
Discours de la Méthode. 


2 Debeaune to Marin Mersenne, March 1639, Mersenne Correspondance VIII, 348, in F&G 
11.B1(b). Marin Mersenne was a Minim friar who operated an informal postal service for the 
communication of letters across Europe about science and mathematics. 


3 Debeaune, letter to Roberval, sent to Marin Mersenne, Mersenne Correspondance VIII, 142- 
143, in F&G 11.B1(a). 


4 See l’Hôpital [186]. 


5 For the same reason, British mathematicians spoke more and more of fluxional equations 
because Newton had expressed his ideas in terms of fluxional quantities, such as the rates of 
change of quantities. 


6 This was an issue that Descartes recognised when he put forward his ideas about geometry 
in 1637. 


7 See Galileo Two New Sciences [113], 149, and Huygens [149]. For a historical account, see 
Bukowski [26]. 


8 See Bernoulli [8]. The solution is the curve called a catenary (the name derives from the 
Latin catena, for a chain) with equation $y = \cosh x$. 


9 See Fregulia and Giaquinta [109]; the quoted remark is from a private communication. 


10 See Bernoulli, J. Opera Omnia 3, 1742, 423-424, and F&G 13.B1. 


11 See Bernoulli [11] and F&G 13.B2. 


12 See Euler [78]. 


13 One person’s method can be somebody else’s problem, and so forth, but the trichotomy is 
no less useful even so. 


14 See Eneström [66], 33-38, Cannon and Dostrovsky [28], F&G 14.A1(a), and Euler [99], 
00213. 


15 Bernoulli used a needle. 


16 The subtangent to a curve at a point P is the distance from the point where the tangent 
meets the x-axis to the point on the x-axis vertically above or below P. So, if P has coordinates 
$(x, y)$ and $\frac{dy}{dx} = p$ at P, then the subtangent is $\frac{y}{p}$. 
2.1 Introduction 


Inspired by calculus, which made problems look simple that not long 
before no one had dared to raise, mathematicians began to ask a variety 
of questions about curves. We met some in the previous chapter that 
led to inverse tangent problems, but others were to lead to a new 
branch of the calculus and ultimately to new principles for the study of 
mechanics. As such, if they were solved at all they were solved by 
ingenuity rather than a systematic method, but—as we shall see—the 
insights that were produced on the way were often deep and lasting. 

Very likely the oldest, most attractive, and most famous problem of 
the kind we are about to discuss is known as Dido’s problem, which 
asks for the shape of the largest area bounded by a straight line and a 
curve of given length, or, in some versions, the greatest planar area 
enclosed by a curve of given length. According to various mythological 
sources, Dido fled the city of Tyre, perhaps around 825 BCE, and came 
to a place on the North African coast where she was granted permission 
to have as much land as a strip made from oxhide could enclose. She cut 
the hide long and thin, and enclosed an area upon which the city of 


Carthage was founded—but which problem she solved, and what her 
solution was, mythology does not make precise.” 

The problem was brought to people’s attention in one of Lord 
Kelvin’s popular lectures in 1893, and the mathematician Adolf 
Kneser solved it in his textbook [160] in this form: Find the curve of given 
length joining two points A and B that, together with the chord AB, 
encloses the greatest area. The solution is a circular arc with the line as 
a chord, as we shall see in Sect. 26.3. But he returned to it in a paper 
[161], where he pointed out that the problem is far from being the 
simplest in the calculus of variations. It is entirely possible that Dido 
could have taken the problem to mean: Find the curve of given length 
joining two points on the coast that encloses the greatest area, and this 
allows for the end points to be variable. In this case, too, the solution 
curve must be a circular arc, but it is not clear that the maximum area is 
necessarily attained.° 


2.2 Bernoulli's Problems 


In June 1696, Johann Bernoulli added a challenge to the mathematical 
community to a paper of his in the journal the Acta Eruditorum: 


Given points A and B in a vertical plane to find the path AMB 
down which a movable point M must, by virtue of its weight, 
proceed from A to B in the shortest possible time. 


This problem, the problem of quickest descent or, to give it the name for 
the curve that Bernoulli gave it from the Greek, the brachistochrone, is 
emblematic of the topic. Bernoulli added that the solution was not the 
straight line joining A and B but was in fact a curve well known to 
mathematicians, and promised to publish the solution if no one could 
find it. He also sent the problem in a letter to Leibniz on 9 June 1696, 
who replied on 16 June with a solution and the suggestion that 
Bernoulli delay publishing the solution because the journal travelled 
only slowly across Europe. 

In May 1697, Bernoulli published his solution, and one by his 
brother Jakob, as well as discussions of the problem by Tschirnhaus and 
de l’Hôpital. Leibniz withdrew his solution, saying that it was too close 
to the ones proposed by the Bernoulli brothers. 

Before we look at Johann Bernoulli’s solution, we should note that 
the problem was not original with him. It had been proposed before, by 
Galileo, in his Dialogues Concerning Two New Sciences, where he gave a 
fallacious argument that purported to show that the solution was an arc 
of a circle.* 

This mistake comes at the end of the long dialogue on the third day 
of the Two New Sciences, where Galileo recorded his lasting 
contribution to the study of motion: his laws of falling bodies.° He had 
studied the times taken by balls to roll down inclined planes of various 
slopes, thus slowing their rate of descent to lengths of time that could 
be measured by accurate, regular counting. He was led to proclaim that 
a uniformly accelerated body will fall as far in an interval of time as one 
moving with a constant velocity that is the average of the initial and 
final speeds of the first body. Furthermore, the distance covered by the 
accelerating body increases as the square of the time. 

Then he noted (see Fig. 2.1) that “If a body falls freely along smooth 
planes inclined at any angle whatsoever, but of the same height, the 
speeds with which it reaches the bottom are the same”. Crucially for 
present purposes, he immediately remarked (Theorem III, Proposition 
II) 


If one and the same body, starting from rest, falls along an 
inclined plane and also along a vertical, each having the same 
height, the times of descent will be to each other as the lengths 
of the inclined plane to the vertical. 


Fig. 2.1 Galileo’s law of falling bodies 


We would see this by resolving the velocity along the slope into its 
horizontal and vertical components. One ball falls from $O$ to $P_1$, a 
distance of $l_1$, which it reaches with velocity $v_1$. Another ball rolls from 
$O$ to $P$, a distance of $l$, which it reaches with velocity $v$. We have, by 
conservation of energy, 

\[ v_1 = v, \qquad l_1 = l\cos\alpha, \]

so 

\[ \frac{v_1}{l_1} = \frac{v}{l\cos\alpha}. \]

We also know that, because the acceleration is uniform, the time to fall 
from rest through a distance $h$ is $h$ divided by half the velocity at $h$, so 

\[ t_1 = l_1\,\frac{2}{v_1} \quad\text{and}\quad t = l\,\frac{2}{v}, \]

so 

\[ \frac{t}{t_1} = \frac{l}{l_1}, \]

as Galileo claimed. 
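
Galileo's proportion can be checked numerically. The following Python sketch is an illustration added here, not part of Galileo's argument. It assumes a frictionless slide from rest under constant gravity (rather than a rolling ball), and the function name `descent_time` and the sample lengths are hypothetical choices.

```python
import math

def descent_time(length, height, g=9.81):
    """Time to slide from rest down a straight frictionless incline.

    The acceleration along the slope is g * height / length, and
    length = 0.5 * acceleration * t**2 gives the time t.
    """
    acceleration = g * height / length
    return math.sqrt(2.0 * length / acceleration)

height = 1.0     # common height of both paths (illustrative value)
incline = 2.5    # length of the inclined plane (illustrative value)

t_vertical = descent_time(height, height)   # vertical fall: length equals height
t_incline = descent_time(incline, height)

ratio_of_times = t_incline / t_vertical
ratio_of_lengths = incline / height
print(ratio_of_times, ratio_of_lengths)
```

The two printed ratios should agree up to rounding, which is the content of the proposition quoted above.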

Galileo’s argument was not that different: he measured the 
magnitude of the force acting on the ball on the slope by the weight of a 
ball hanging vertically down from the top of the slope that was attached 
to the first ball by a cord and such that the two balls did not move. 

The quantification of velocity and acceleration by Galileo was to be 
exactly what was needed to study motion with the advent of the 
calculus. 

Equal significance is attached to Fermat’s argument that the path of 
light in a varying medium is the one that takes the least time. In the late 
1630s, he had already been in an argument with Descartes about the 
refraction of light and the explanation of Willebrord Snell’s law for 
refraction. Snell’s law states that when a ray of light leaves one medium 
and enters another there is a constant ratio $r$, determined by the two 
media, such that 

\[ \frac{\sin\theta_1}{\sin\theta_2} = r, \]

where $\theta_1$ and $\theta_2$ are the angles the light makes with the normal at the 
point of crossing in the two media (see Fig. 2.2). 


Fig. 2.2 Snell’s law of refraction 


In 1657, Fermat learned of the Greek mathematician Heron’s ideas 
about why light travels in straight lines (in a constant medium) and saw 
that he could adapt them to explain refraction. He supposed that light 
travels at one speed in air, say $v_2$, and another, slower, speed in water, 
say $v_1$, and then wrote down the time of travel from a point in the 
water to a point in the air on the assumption that it travelled along 
straight lines in each medium. His ad hoc techniques for finding the 
minima of certain quantities were up to the task, and he deduced that in 
the present set-up, 

\[ \frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2}. \]

Fig. 2.3 Fermat’s deduction 


We can argue the same conclusion slightly more rigorously. We 
choose a point $P_1$ that is $a_1$ units below the surface of the water, and a 
point $P_2$ that is $a_2$ units above it, and we suppose they are $d$ units 
apart horizontally. We suppose that the light travels along a straight line 
from $P_1$ to a point $P$ on the water surface and then along a straight line 
to $P_2$. With angles with the normal at $P$ as given, we have for the 
horizontal distance (Fig. 2.3) 

\[ a_1\tan\theta_1 + a_2\tan\theta_2 = d. \]

The distance travelled in the water is $a_1\sec\theta_1$ and in the air is 
$a_2\sec\theta_2$, and so, with $v_1$ the speed of light in the water and $v_2$ the 
speed in the air, the total time taken is 

\[ T = \frac{a_1\sec\theta_1}{v_1} + \frac{a_2\sec\theta_2}{v_2}. \]

From the first equation, we find by differentiating that 

\[ a_1\sec^2\theta_1 + a_2\sec^2\theta_2\,\frac{d\theta_2}{d\theta_1} = 0. \]

From the second equation, we deduce that 

\[ \frac{dT}{d\theta_1} = \frac{a_1\sin\theta_1\sec^2\theta_1}{v_1} + \frac{a_2\sin\theta_2\sec^2\theta_2}{v_2}\,\frac{d\theta_2}{d\theta_1}. \]

For the shortest time, we require that 

\[ \frac{dT}{d\theta_1} = 0. \]

Eliminating $\frac{d\theta_2}{d\theta_1}$ from these equations, we find that 

\[ 0 = \frac{a_1\sin\theta_1\sec^2\theta_1}{v_1} + \frac{a_2\sin\theta_2\sec^2\theta_2}{v_2}\left(-\frac{a_1\sec^2\theta_1}{a_2\sec^2\theta_2}\right), \]

which simplifies to 

\[ \frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2}. \]


The constant in Snell’s law is revealed to be the ratio of the velocities of 
light in the two media. 
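
The least-time argument can also be tested numerically. The sketch below is an illustration added here and is neither Fermat's method nor the derivation above: it searches directly for the crossing point on the surface that minimises the travel time and then compares the ratio of sines with the ratio of speeds. The function names, the golden-section search, and the sample values of the depths and speeds are all assumptions made for the example.

```python
import math

def travel_time(x, a1, a2, d, v1, v2):
    """Time from P1 (a1 below the surface) to P2 (a2 above it), crossing
    the surface at horizontal offset x from P1."""
    return math.hypot(a1, x) / v1 + math.hypot(a2, d - x) / v2

def minimise(f, lo, hi, iterations=200):
    """Golden-section search for the minimum of a unimodal function."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iterations):
        c = b - phi * (b - a)
        e = a + phi * (b - a)
        if f(c) < f(e):
            b = e      # the minimum lies in [a, e]
        else:
            a = c      # the minimum lies in [c, b]
    return 0.5 * (a + b)

a1, a2, d = 1.0, 2.0, 3.0   # illustrative depth, height and horizontal separation
v1, v2 = 0.75, 1.0          # speed in water (slower) and speed in air

x_best = minimise(lambda x: travel_time(x, a1, a2, d, v1, v2), 0.0, d)
sin_theta1 = x_best / math.hypot(a1, x_best)
sin_theta2 = (d - x_best) / math.hypot(a2, d - x_best)
snell_ratio = sin_theta1 / sin_theta2
print(snell_ratio, v1 / v2)
```

The travel time is a sum of two convex functions of the crossing point, so it has a single minimum and the search is safe on the interval from 0 to d.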

So Fermat’s principle that light takes the least time to travel 
between two points led him to a theoretical derivation of Snell’s law— 
but on what basis? Can this principle really be fundamental, or is it not 
the case that light behaves as it does for some (possibly unknown) 
reason and that this reason implies the principle? Can a principle such 
as least time act as a cause? These questions were not to be answered 
for a long time, and Bernoulli himself is our source for the information 
that⁶ 


Leibniz in the Acta Eruditorum, 1682, pp. 285 et seq., and soon 
after the famous Huygens, in his Treatise on Light, p. 40, have 
demonstrated this more comprehensively and by most valid 
arguments, have established the physical, or better the 
metaphysical, principle which Fermat seems to have abandoned. 


2.3 The Bernoullis’ Brachistochrones 


Johann Bernoulli tackled the brachistochrone problem by what was to 
become a standard method in mechanical questions: replace the 
problem by a number of discrete problems and let these problems 
crowd together and their number increase indefinitely until they tend 
to the original problem of interest.⁷ 

In this case, Bernoulli considered the path of light through a 
sequence of horizontal layers of translucent material, each layer having 
a different density. As he put it, they are made of “a diaphanous matter 
of a certain density decreasing or increasing according to a certain law”. 
At each boundary, Snell’s law applies and so the path of light through 
these media can be determined. 

How he had this idea is not known, but it is clear enough that by 
adjusting the density of the diaphanous layers a wide variety of paths 
can be obtained, just as a varying law of acceleration can. So “In this 
way we can solve the problem for an arbitrary law of acceleration, since 
it is reduced to the determination of the path of a light ray through a 
medium of arbitrarily varying density”. 

Then by looking at an infinitesimal moment, Bernoulli deduced that 
if the moving particle goes from a point $(x, y)$ to a point $(x + dx, y + dy)$ 
and its velocity increases from $v$ to $v + dv$ then, by Snell’s law, 

\[ \frac{dx}{dz} = \frac{v}{a}, \]

for some constant $a$, where $dz$ is the amount of motion along the 
tangent, so $dx^2 + dy^2 = dz^2$. This gave him that 

\[ \frac{dx}{dy} = \frac{v}{\sqrt{a^2 - v^2}}. \]

He then took from Galileo's law of falling bodies that $v = \sqrt{ay}$, and so 

\[ dx = \sqrt{\frac{y}{a - y}}\,dy, \]

and so, he said, the brachistochrone is a cycloid. 
Neither for him nor for us is this deduction entirely easy to make. 


We can write 

\[ x = \int_0^y \sqrt{\frac{y}{a - y}}\,dy. \]

Set $y = a\sin^2 t$, so $dy = 2a\cos t\sin t\,dt$, and the integral becomes 

\[ x = 2a\int_0^t \sin^2 t\,dt = -\frac{a}{2}\sin 2t + at = \frac{a}{2}(2t - \sin 2t). \]

We can write 

\[ \sin^2 t = \frac{1}{2}(1 - \cos 2t) \]

and so express the solution curve parametrically in the form 

\[ x = \frac{a}{2}(2t - \sin 2t), \qquad y = \frac{a}{2}(1 - \cos 2t). \]

This is the equation of a cycloid that starts at the origin. Moreover, this 
cycloid has a vertical initial tangent, because when $t = 0$ 

\[ \frac{dx}{dt} = 0 = \frac{dy}{dt} \quad\text{but}\quad \frac{d^2x}{dt^2} = 0 \quad\text{and}\quad \frac{d^2y}{dt^2} = 2a \neq 0. \]


The constant a can be determined from the separation of the end 
points, and once that is given the cycloid is unique. 
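
A numerical comparison makes the cycloid's advantage concrete. The sketch below is an illustration added here, not Bernoulli's argument. It assumes the speed law $v = \sqrt{ay}$ used above, with $y$ measured downwards, takes $a = 1$, and compares the time of descent along the cycloid with the time along the straight line to the same end point. The function names and the chord-summing quadrature are assumptions made for the example.

```python
import math

a = 1.0  # the constant in v = sqrt(a * y); an illustrative value

def cycloid(t):
    """Point on the cycloid x = (a/2)(2t - sin 2t), y = (a/2)(1 - cos 2t)."""
    return 0.5 * a * (2 * t - math.sin(2 * t)), 0.5 * a * (1 - math.cos(2 * t))

t_end = math.pi / 2
X, Y = cycloid(t_end)   # lower end point of the descent

def straight_line(w):
    """Straight line from the origin to (X, Y); using w*w keeps steps short near the start."""
    return X * w * w, Y * w * w

def descent_time(curve, param_end, steps=20000):
    """Approximate the integral of ds / v by summing over short chords."""
    total = 0.0
    x0, y0 = curve(0.0)
    for i in range(1, steps + 1):
        x1, y1 = curve(param_end * i / steps)
        chord = math.hypot(x1 - x0, y1 - y0)
        y_mid = 0.5 * (y0 + y1)              # depth at the middle of the chord
        total += chord / math.sqrt(a * y_mid)
        x0, y0 = x1, y1
    return total

t_cycloid = descent_time(cycloid, t_end)
t_line = descent_time(straight_line, 1.0)
print(t_cycloid, t_line)
```

With this speed law the element of time along the cycloid is $ds/v = 2\,dt$, so the first figure should be close to $2t = \pi$, and the straight line should take noticeably longer.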

Jakob Bernoulli's solution was different and seems to have 
influenced Euler a generation later. He argued that if a path between 
points A and B is the path of quickest descent then it must be the path 
of quickest descent between any two of its points.⁸ For, if it was not the 
path of quickest descent between two intermediate points C and D, say, 
then that path could be replaced with a quicker one and this would 
shorten the time of descent from A to B as well, which was assumed to 
be a minimum. 

He then formulated this insight as an infinitesimal statement about 
a piece of the quickest path as compared to some nearby path, drawing 
on Galileo’s laws of motion. By an argument here suppressed he arrived 
at the same integral, and therefore the same solution, as his brother. 


2.4 Geodesics on Surfaces 


A geodesic on a surface is a curve of the shortest length that joins two 
points on the surface and lies entirely in the surface. In the plane, a 
geodesic is a straight line; on the sphere, it is an arc of a great circle (the 
circle cut out by the plane that passes through the given points and the 
centre of the sphere). 

In 1697, Johann Bernoulli challenged the mathematical community 
to investigate geodesics on curved surfaces. His insight into the 
problem had to do with the best approximating plane to a curve. He 
considered a geodesic on a surface and passing through a point P, and 
he looked at two points P’ and P” on the geodesic that tend to P. He 
argued that the plane through these three points on the geodesic then 
tends, as these points tend together to the point P, to the plane 
containing the tangent to the geodesic at P that is perpendicular to the 
surface at P. 

How might we come to believe that? We might argue that if the 
geodesic is traversed at a constant speed then its normal is 
perpendicular to the curve, the normal and the tangent between them 


define the plane in question, and for a geodesic, the normal is also 
perpendicular to the surface. Or, we might argue that the claim is true 
for a sphere, so it is true for the best approximating sphere to the 
surface at P and because these surfaces are arbitrarily close in the limit 
what is true for the sphere is true for the surface. Of course, we need 
another argument for surfaces that are saddle-shaped near P. 

Bernoulli's argument was something like the first of ours, but in 
reverse. From the original geometric insight, he deduced that the 
curvature vector is proportional to the normal vector at every point on 
a geodesic. Thus, he could interpret the requirement that the curvature 
vector of a curve in a surface be normal to the surface as an equation 
for a geodesic. Even so, this did not lead to a solution to the problem 
except in special cases, and the problem lay fallow for 30 years until 
Bernoulli proposed it to the young Leonhard Euler in 1728. 

It is worth noting that it follows from Johann Bernoulli’s 
characterisation of a geodesic that force-free motion along a surface is 
along a geodesic if force-free is taken to mean no forces acting on the 
surface (or, if you prefer, there are no forces acting that have a non-zero 
component in the tangent plane). 

Euler published a short paper on geodesics in 1732, in answer to a 
question from Johann Bernoulli. He considered a geodesic GMH on a 
surface, where the points are infinitely close together and M is the 
midpoint, and the plane through M parallel to the (y, z)-plane cuts the 
surface in the curve JMK. 

Euler took Cartesian coordinates in space—one of the earliest times 
this had been done—and wrote down the distances GM and MH on the 
assumption that they were well approximated by infinitesimal line 
segments. He said that the coordinates of the points were 


\[ G = (a, b, c), \qquad M = (a + \alpha, y, z), \qquad H = (a + 2\alpha, f, g), \]

so, by the three-dimensional Pythagorean theorem, distances between 
the points are 

\[ GM = \sqrt{\alpha^2 + (y - b)^2 + (z - c)^2}, \qquad MH = \sqrt{\alpha^2 + (f - y)^2 + (g - z)^2}. \]

For this to be a minimum as $M$ varies the differential of $GM + MH$ 
must vanish, and this implies that 

\[ \frac{(y - b)\,dy + (z - c)\,dz}{GM} = \frac{(f - y)\,dy + (g - z)\,dz}{MH}. \]

It remained for Euler to eliminate the arbitrary infinitesimals $\alpha$, $y$, $z$, $f$, 
and $g$. After some work, here omitted, he obtained a second-order 
ordinary differential equation that he was able to solve in simple cases, 
when the surface is a cylinder, a conoid (a cone on a plane curve), or a 
surface of rotation. 


2.5 Exercises 

1. 
Obtain a formula for the radius of curvature of a curve and show 
that for a curve traversed at unit speed the radius is the reciprocal 
of the magnitude of the acceleration. 

2. 
Use Bernoulli’s insight in Sect. 2.4 to find the geodesics on a sphere, 
a cylinder, and a cone. 


Questions 


1. 
Information about tangents to a curve and its radii of curvature 
translates into information about velocity and acceleration along a 
curve. Why do you think this often struck mathematicians in the 
eighteenth and early nineteenth centuries as enough? 


Footnotes 


1 This chapter follows Freguglia and Giaquinta [109] to which readers are referred for much 
fascinating information. 


2 In Virgil’s Aeneid, Dido then falls in love with Aeneas, who was seeking a new home after the 
destruction of his native Troy, but when he abandons her she commits suicide, calling down 


endless hate upon him. This was the origin of the Punic war centuries later between her city, 
Carthage, and Rome, the city founded by descendants of Aeneas. Virgil’s poem is most likely 
incompatible with what we know of the Trojan wars. 


3 We look at a solution to this problem in Sect. 26.3.1 below. 


4 See Galileo [113] Theorem XXII, Proposition XXXVI, Scholium. 


5 For an extract, see F&G 10.B4. 


6 See Freguglia and Giaquinta [109], 40. 


7 See Bernoulli [9]. There is an English translation of this paper and Jakob’s in Struik Source 
Book 391-399. 


8 If the points A and B do not lie in a vertical line then the particle must start with some non- 
zero velocity. 


© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 
J. Gray, Change and Variations, Springer Undergraduate Mathematics Series 
https://doi.org/10.1007/978-3-030-70575-6_3 


3. The Vibrating String and the Partial 
Differential Calculus 


Jeremy Gray¹ 
(1) School of Mathematics and Statistics, Open University, Milton 
Keynes, UK 


Jeremy Gray 
Email: jeremy.gray@open.ac.uk 


3.1 Introduction 


The study of problems involving more than one independent variable 
and the extension of the calculus to deal with these problems were 
significant advances of the first half of the eighteenth century. The first 
significant success was d'Alembert’s mathematically correct description 
of the vibrating string, which has become famous as being the first 
partial differential equation to be solved, and although that title 
can be disputed on a technicality, this should not be allowed to mask 
the real breakthrough his analysis achieved. Here we examine what he 
did to formulate and solve a problem in two independent variables and 
show how it enabled many basic phenomena of musical sounds to be 
explained.¹ 

In the case of two independent variables, d’Alembert and 
Euler found it natural to introduce formal complex variables, but this 
raised questions they were unable to answer about the implications of 
using them.² 


3.2 Early Investigations into the Partial 


Differential Calculus 


The first person to extend the calculus systematically to two 
independent variables was Nicolaus I Bernoulli in unpublished work in 
1719. He had been led to it through his work on a problem of 
contemporary interest, the determination of a family of orthogonal 
trajectories to a given family of curves. Each curve in the orthogonal 
trajectory is specified by a parameter $\alpha$; the coordinates of a point on a 
given curve are functions of a parameter $t$. 

We know from unpublished manuscripts that Bernoulli deduced the 
equality of mixed partial derivatives from the following observation³: 
moving from an initial point $t$ on a curve with parameter $\alpha$ to the point 
$t + dt$ and then along the orthogonal trajectory to a point on the curve 
with parameter $\alpha + d\alpha$ is the same as first moving to the curve with 
the parameter $\alpha + d\alpha$ and then changing $t$ to $t + dt$. Euler argued 


much the same way independently in his “De differentiatione” of 1730, 
which he also left unpublished for several years. When he eventually 
did publish this result in another paper [72], Nicolaus wrote to Euler in 
1743 to say that he had not published it himself because he regarded it 
as an axiom “which I thought to be obvious to anybody from the mere 
notion of differentials”. 

The impetus to extend the calculus to functions of two variables had 
a second source in the study of ordinary differential equations, 
specifically when mathematicians were led to consider inexact 
differentials. These are expressions of the form $a(x, y)\,dx + b(x, y)\,dy$ 
that cannot be written as $d(g(x, y))$. Alexis Claude Clairaut, in his paper 
[43], began by noting that if 

\[ a(x, y)\,dx + b(x, y)\,dy = d(g(x, y)) \]

then necessarily, by the equality of mixed partial derivatives, $\frac{\partial a}{\partial y} = \frac{\partial b}{\partial x}$, 
and conversely, or so he claimed, if this condition is met then the 
differential is exact. This is only true if the functions $a(x, y)$ and $b(x, y)$ 
are defined everywhere in a simply connected region, as the counter-example 
$a(x, y) = -y/(x^2 + y^2)$, $b(x, y) = x/(x^2 + y^2)$ shows; this 
differential satisfies the condition away from the origin but is not exact. 
But Clairaut deduced the theory from a 
consideration of monomials of the form $x^m y^n$ because he believed that 


every function of two variables is expressible as an infinite sum of such 
monomials.⁴ 
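
The gap in Clairaut's converse can be seen numerically. The sketch below is an illustration added here, not Clairaut's argument. It takes the standard example $a = -y/(x^2 + y^2)$, $b = x/(x^2 + y^2)$ on the plane with the origin removed, checks by finite differences that the two partial derivatives agree at a sample point, and then integrates $a\,dx + b\,dy$ once round the unit circle. For an exact differential that integral would be zero. The function names, the sample point, and the step sizes are assumptions made for the example.

```python
import math

def a(x, y):
    return -y / (x * x + y * y)

def b(x, y):
    return x / (x * x + y * y)

def partial(f, x, y, wrt, h=1e-5):
    """Central-difference approximation to a first partial derivative."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def loop_integral(radius=1.0, steps=100000):
    """Integrate a dx + b dy once anticlockwise round a circle about the origin."""
    total = 0.0
    for i in range(steps):
        t0 = 2 * math.pi * i / steps
        t1 = 2 * math.pi * (i + 1) / steps
        tm = 0.5 * (t0 + t1)
        x, y = radius * math.cos(tm), radius * math.sin(tm)
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        total += a(x, y) * dx + b(x, y) * dy
    return total

# The condition da/dy = db/dx holds at a sample point away from the origin ...
mixed_gap = partial(a, 0.7, -0.4, "y") - partial(b, 0.7, -0.4, "x")
# ... yet the integral round a closed loop enclosing the origin is not zero.
circulation = loop_integral()
print(mixed_gap, circulation)
```

The loop integral comes out close to $2\pi$, so no single-valued function $g$ can have this differential on a region that surrounds the origin.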

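When the differential is not exact he proposed to look for a factor 
$\mu(x, y)$ such that 

\[ \mu(x, y)\,a(x, y)\,dx + \mu(x, y)\,b(x, y)\,dy \]

is exact, and this led him to the partial differential equation 

\[ \frac{\partial(\mu a)}{\partial y} = \frac{\partial(\mu b)}{\partial x}, \]

or, equivalently, 

\[ a\,\frac{\partial\mu}{\partial y} - b\,\frac{\partial\mu}{\partial x} + \mu\left(\frac{\partial a}{\partial y} - \frac{\partial b}{\partial x}\right) = 0. \]

Clairaut was able to give a number of ways of finding the integrating 
factor $\mu$ in particular cases. 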


Clairaut had heard about Euler’s work on the subject from Daniel 
Bernoulli and wrote to Euler about it on 17 September 1740, enclosing 
copies of some of his papers. Euler wrote back on 19 October to say 
that he was very pleased with them, and this started a productive 
association between the two, a highlight of which is Clairaut’s analysis 
of the motion of the Moon that was one of the decisive papers in the 
Continental acceptance of Newtonian gravity in 1749.⁵ 

All this work led to the emergence of partial differential equations 
as a topic of investigation. The first important problem to be solved in 


the theory of partial differential equations was the problem of the 
vibrating string, and historians place the emphasis here, rather than on 
the question of integrating factors, because it marks a real if intangible 
shift towards the full acceptance of two independent variables. 
Clairaut had seen the question of finding an integrating factor as a 
question about differentials and not as a question in the subject of 
partial differential equations; such a subject did not exist and he was 
not inspired to create one. 

But even the story of the wave equation tells us that a new field of 
enquiry was only gradually being born. However natural it might seem 
for someone trying to create a general theory of second-order partial 
differential equations to take the wave equation as a major example, 
d’Alembert never wrote down the wave equation when he studied the 
motion of the vibrating string in his [52]. Even at this stage, the 
problem of the vibrating string was a problem in the partial differential 
calculus rather than in the theory of partial differential equations, 
which was still to be created. 


3.3 D’Alembert: The Vibrating String and the 


Wave Equation 


The problem of the vibrating string had by then attracted the attention 
of mathematicians for over a century because of its close connections to 
music. Every musician knows that a violin string makes a predictable 
sound and that tightening the string raises its pitch, as does shortening 
it. In 1638, Marin Mersenne had stated this law for determining the 
frequency of vibration of a string: 


\[ \nu = \frac{C}{\ell}\sqrt{T}, \]

where $\nu$ denotes the frequency, $\ell$ the length, and $T$ the tension in the 
string and $C$ is a constant (determined by the nature of the string). 


However, neither he nor anyone he consulted could explain why this 
rule should be true. Nor could Christiaan Huygens, some years later. 
The first person to get anywhere with it was Brook Taylor in 1713. 


Taylor came from a musical family—he played the harpsichord— 
and his cryptic paper (Taylor [251]) earned him the reputation of being 
the first person to derive Mersenne’s law mathematically. After writing 
up this theoretical account, he devised ingenious experiments designed 
to measure the rate at which harpsichord strings vibrate (they vibrate 
too fast for anyone to count). 

Taylor began his paper by making two simplifying assumptions. 


• The amplitude of oscillation of a string is independent of its 
frequency (volume is independent of pitch). 
• The string vibrates in such a way that all of the string crosses the 
axis simultaneously. 


The second, and far from plausible, assumption enabled Taylor to 
argue that each point of the string behaved like a simple pendulum, and 
that each point on it moves up and down with the same period. He then 
argued that the force at each point is determined by the curvature of 
the string at that point, and that it is equal to the force that would cause 
a simple pendulum to oscillate with the same period as the string. As a 
result, Taylor was able to determine the shape of the string and its 
frequency of vibration, and to derive Mersenne’s law. 

In fact, Taylor’s second assumption is wrong for all but the simplest 
oscillations, nor does it follow that each point of the string behaves like 
a simple pendulum. Even so, his analysis was the accepted one until it 
was replaced by d’Alembert’s account. 


3.3.1 D’Alembert’s Breakthrough 

In a paper written in 1747 and published in 1749, d’Alembert (see 
Fig. 3.1) assumed that the string was of uniform thickness. He then 
wrote’: 


Let t be the time elapsed from the moment when the string 

started to vibrate: it is certain that the ordinate PM can only be 
expressed by a function of the time t and of the abscissa or the 
corresponding arc s or AP. Let, therefore, $PM = \varphi(t, s)$, that is, 


let it be equal to an unknown function of t and s.® 


So in d’Alembert’s account, the height of the string above the x-axis at 
time $t$ is given by a function $y = \varphi(t, s)$, where the variable $s$ denotes arc 


length along the string. He then explicitly assumed that the vibrations 
of the string are so small that the length of the string from one point to 
another is “reasonably equal” to the difference in the x coordinates of 
the points. This made the mathematics tractable, at the price of 
considerably restricting the analysis. 


Fig. 3.1 Jean le Rond d’Alembert (1717-1783), Artist unknown, after Maurice Quentin de la 
Tour 


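D’Alembert set $dy = p\,dt + q\,ds$, where $p = \frac{\partial y}{\partial t}$ and $q = \frac{\partial y}{\partial s}$. He 
referred to Euler [72] for the equality of mixed partial derivatives, and 
stated that 

\[ dp = \alpha\,dt + \nu\,ds, \qquad dq = \nu\,dt + \beta\,ds, \tag{3.1} \]

where 

\[ \alpha = \frac{\partial^2 y}{\partial t^2}, \qquad \nu = \frac{\partial^2 y}{\partial t\,\partial s}, \qquad \beta = \frac{\partial^2 y}{\partial s^2}. \]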


He then followed Taylor, and argued on physical grounds that the 


acceleration of a point on the string is proportional to $\pm\frac{\partial^2 y}{\partial s^2}$, where the 
sign is positive if the curve is concave towards the x-axis and negative if 
it is convex. 

Next, he argued that the acceleration at a point depends only on 
position, so $\frac{\partial^2 y}{\partial s^2} = \beta$ (this uses the identification of string length with 
the x coordinate, so the oscillations must be very small). Then, by 
looking at the position of the string at two moments of time an amount 
$dt$ apart, he argued that a point will have moved an amount $\alpha\,dt^2$. He 

brought these two observations together by referring to Newton’s 
Principia, where Newton had discussed motion under gravity, to deduce 
that (with respect to a suitable choice of units)’: 


\[ \alpha = \beta. \]


Had he transcribed his remark into the notation of second partial 
derivatives he would have written the wave equation, 


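\[ \frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial s^2}, \]

where $\frac{\partial^2 y}{\partial t^2}$ is the second derivative of $y$ with respect to $t$ with $s$ 
regarded as a constant, and $\frac{\partial^2 y}{\partial s^2}$ is the second derivative of $y$ with 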

respect to s with t regarded as a constant, but evidently that is not how 
d’Alembert thought of it. 
To solve the equations, 


\[ dp + dq = (\alpha + \nu)(dt + ds) \quad\text{and}\quad dp - dq = (\alpha - \nu)(dt - ds), \]

we note that they can be written in the form 

\[ dp + dq = (\alpha + \nu)\,d(t + s) \quad\text{and}\quad dp - dq = (\alpha - \nu)\,d(t - s), \]

when it becomes clear that $p + q$ is a function of $t + s$ and $p - q$ is a 
function of $t - s$. D’Alembert wrote 

\[ p = \Phi(t + s) + \Delta(t - s) \quad\text{and}\quad q = \Phi(t + s) - \Delta(t - s), \]

and deduced that the general shape of the string was given by 

\[ y = \Psi(t + s) + \Gamma(t - s), \]

where $\Psi$ and $\Gamma$ are arbitrary functions; $\Phi(t + s) = \Psi'(t + s)$ 
and $\Delta(t - s) = \Gamma'(t - s)$. Here, the primes denote 
differentiation with respect to the argument. 

This was a dramatic moment: the first time that the most powerful 
branch of the calculus, that of differential equations, was shown to 
extend to problems with more than one independent variable.¹⁰ A path 
now seemed to be open to tackle the many problems in several 
variables in which the natural world would surely abound. 

But the success soon brought with it a profound disquiet. 
D’Alembert’s solutions were anything of the form 


\[ y(t, s) = f(ct + s) + g(ct - s), \]

where $f$ and $g$ are arbitrary functions. Upon reflection, the functions $f$ 
and $g$ should be capable of being differentiated twice (and of course for 
each value of $t$ the graph of $y$ depicts a string fastened down at each 
end, as the original problem requires). This solution is very general, 
which is as it should be, because the string can be released from any 
initial shape and with any initial velocity. As d’Alembert noted “this 
equation includes an infinity of curves”.¹¹ 


But just how general such a solution curve could be was a question 
that, once raised, was to become one of the most famous mathematical 
controversies of the century. For a further discussion of it, see any good 
history of mathematics in the eighteenth century. 

In his [53], d’Alembert looked for solutions of the form 


\[ y(t, s) = F(t) \times G(s). \]


This reduced his differential equation in two independent variables to 
two differential equations each in a single variable, as follows. 
If we substitute $y(t, s) = F(t) \times G(s)$ into the equation 

\[ \frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial s^2}, \]

we obtain 

\[ \frac{\partial^2 F}{\partial t^2}\,G(s) = c^2\,\frac{\partial^2 G}{\partial s^2}\,F(t), \]

which implies that 

\[ \frac{1}{F(t)}\frac{\partial^2 F}{\partial t^2} = \frac{c^2}{G(s)}\frac{\partial^2 G}{\partial s^2}. \]

But a function of $t$ can only be equal to a function of $s$ if they are both 
constant, which we shall call $-k^2c^2$. Then we obtain the ordinary 
differential equations 

\[ \frac{d^2F}{dt^2} = -k^2c^2F(t) \quad\text{and}\quad \frac{d^2G}{ds^2} = -k^2G(s). \]

The solutions to these equations are of the form: 

\[ F(t) = \cos kct \text{ or } \sin kct, \qquad G(s) = \cos ks \text{ or } \sin ks. \]

Here, $c$ is a constant determined by the material in the string, the 
tension in the string, and its shape, and $k$ is some constant, as yet 
undetermined, related to the frequency of the string’s vibration. (We 
can now see that only choosing a negative quantity for the above 
constant makes physical sense: if you follow through the above 
argument with $+k^2c^2$ you see that hyperbolic functions are obtained 
that cannot match the presumed boundary conditions and are liable to 
grow impossibly large.) 

For the first time, recognisable solutions had appeared: functions of 
the form $\cos kct \cos ks$ (or $\cos kct \sin ks$, and so on) are solutions of 


the wave equation, although by no means the most general. 
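
Both kinds of solution can be checked against the wave equation by finite differences. The sketch below is an illustration added here, not d'Alembert's procedure. The values of $c$ and $k$, the sample functions standing in for $f$ and $g$, and the function names are all assumptions made for the example.

```python
import math

c, k = 2.0, 3.0   # illustrative values of the constants c and k

def separated(t, s):
    """A separated solution y = cos(kct) * sin(ks)."""
    return math.cos(k * c * t) * math.sin(k * s)

def travelling(t, s):
    """A solution of the form f(ct + s) + g(ct - s), with sample choices of f and g."""
    return math.exp(-(c * t + s) ** 2) + math.sin(c * t - s)

def second_difference(y, t, s, wrt, h=1e-4):
    """Central second difference of y in t or in s."""
    if wrt == "t":
        return (y(t + h, s) - 2 * y(t, s) + y(t - h, s)) / h**2
    return (y(t, s + h) - 2 * y(t, s) + y(t, s - h)) / h**2

def residual(y, t, s):
    """y_tt - c^2 * y_ss, which should vanish for a solution of the wave equation."""
    return second_difference(y, t, s, "t") - c**2 * second_difference(y, t, s, "s")

residual_separated = residual(separated, 0.3, 0.4)
residual_travelling = residual(travelling, 0.3, 0.4)
print(residual_separated, residual_travelling)
```

Both residuals should be zero up to the discretisation and rounding error of the difference quotients.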

We shall shortly look ahead by three years to see how Euler also 
found that the wave equation has very general solutions. His solution is 
also important because it is one of the first occasions where the 
equality of mixed partial derivatives was used and understood. But 
first, we attend to the physics of musical sounds. 


3.3.2 Mersenne’s Law and Modes 


First, it is clear that d’Alembert’s new ideas led to the first satisfactory 
deduction of Mersenne’s law. We take the solution 


\[ y(t, s) = \cos kct \times \sin ks \]


and look at a particular point on the string—that is, at the situation for 
a fixed value of s—then as time varies this point moves according to the 
equation 


\[ y = \cos kct \times \text{constant}. \]


This means that it oscillates with a frequency, kc, that is the same 
whatever point of the string is taken. So, a given string vibrates with a 
specific frequency. 

To see that the frequency $kc$ is inversely proportional to the length of 
the string, observe that both ends of the string must be fixed, so $y = 0$ 
when $s = \ell$. Therefore, $\sin k\ell = 0$ so $k\ell$ must be a multiple of $\pi$, say 
$k\ell = N\pi$. So $k = N\pi/\ell$ and the frequency is inversely proportional to 
the length of the string, as Mersenne had claimed. 


Musicians knew that halving a string results in a note an octave 
above the basic note (it doubles the frequency). This phenomenon of 
modes was explained in the manner of d’Alembert by Euler in his [80], 
Sect. 41. The solutions 


\[ y = \cos(\pi ct/\ell) \times \sin(\pi s/\ell) \]

correspond to the choice $k = \pi/\ell$ (with $N = 1$) and 

\[ y = \cos(2\pi ct/\ell) \times \sin(2\pi s/\ell) \]

correspond to the choice $k = 2\pi/\ell$ ($N = 2$). In the second case the 


string, which has twice the frequency, behaves as though it were two 
strings each of half the original length and joined at a fixed point in the 
middle. This explains how the same string can be made to play certain 
different notes without being tightened or changed in length, and 
indeed that it will naturally vibrate in a variety of ways—but not in any 
way: the tones it can emit—its harmonics—are all the notes whose 
frequencies are multiples of a basic frequency. 
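
The behaviour of the modes can be tabulated directly from $k = N\pi/\ell$. The sketch below is an illustration added here, not Euler's presentation. The string length, the value of $c$, and the function names are assumptions made for the example, and "frequency" is used, as in the text, for the quantity $kc$.

```python
import math

def wave_number(n, length):
    """k = n*pi/length, from the condition sin(k*length) = 0."""
    return n * math.pi / length

def mode(n, length, c, t, s):
    """The n-th mode y = cos(kct) * sin(ks) of a string fixed at s = 0 and s = length."""
    k = wave_number(n, length)
    return math.cos(k * c * t) * math.sin(k * s)

length, c = 0.65, 300.0   # illustrative string length and constant c

fundamental = wave_number(1, length) * c       # frequency kc of the basic note
octave = wave_number(2, length) * c            # second mode of the same string
half_string = wave_number(1, length / 2) * c   # basic note of a string half as long

# In the second mode the midpoint never moves: it behaves like a fixed point.
midpoint_motion = max(abs(mode(2, length, c, 0.001 * j, length / 2)) for j in range(50))

def two_notes(t, s, a=1.0, b=0.5):
    """A sum of the first two modes: the string sounding two notes at once."""
    return a * mode(1, length, c, t, s) + b * mode(2, length, c, t, s)

end_values = (two_notes(0.002, 0.0), two_notes(0.002, length))
print(fundamental, octave, half_string, midpoint_motion, end_values)
```

The second mode and the half-length string have the same frequency, twice that of the basic note, and the sum of two modes still respects the fixed ends.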

Another musical phenomenon, which had puzzled Mersenne, is that 
a string can emit several notes at once. But it is easy to show that if 
$\varphi$ and $\psi$ are solutions of the wave equation, then so is any 
sum of the form $a\varphi + b\psi$ where $a$ and $b$ are arbitrary constants. So a 


string may vibrate in two or more ways simultaneously, emitting two or 
more different notes as it does so. Euler and Daniel Bernoulli had 
noticed the same behaviour in their analysis of the vibrating clamped 
rod in the mid-1730s. It makes it clear, also, that Taylor’s assumption 
that the whole string crosses the axis simultaneously must be wrong. 


3.4 Euler Rewrites the Wave Equation 


Although it will take us slightly ahead of the story, this is a possible 
place to look at how Euler treated the wave equation some years later, 
as he began to develop a theory of partial differential equations. 

In his [80], E213, Euler first rederived the equation of the vibrating 
string in a way that he felt led more simply to the solution. He took the 


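wave equation in the form $\left(\frac{\partial^2}{\partial t^2} - c^2\frac{\partial^2}{\partial x^2}\right) y = 0$, factorised it as 
$\left(\frac{\partial}{\partial t} - c\frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right) y = 0$, and argued (in Sect. 25 of his paper) that the 
equation of motion of the string can be regarded as a system of two 
first-order differential equations. 

More precisely, Euler considered (Sect. 25) a function $y$ that 
satisfied this first-order equation 

\[ \frac{\partial y}{\partial t} = c\,\frac{\partial y}{\partial x} \tag{3.2} \]

and argued that therefore 

\[ \frac{\partial^2 y}{\partial t^2} = \frac{\partial}{\partial t}\left(\frac{\partial y}{\partial t}\right) = c\,\frac{\partial}{\partial t}\left(\frac{\partial y}{\partial x}\right) = c\,\frac{\partial^2 y}{\partial t\,\partial x}, \]

\[ c\,\frac{\partial^2 y}{\partial x\,\partial t} = c\,\frac{\partial}{\partial x}\left(\frac{\partial y}{\partial t}\right) = c^2\,\frac{\partial}{\partial x}\left(\frac{\partial y}{\partial x}\right) = c^2\,\frac{\partial^2 y}{\partial x^2}. \]

So a solution of Eq. (3.2) satisfies the wave equation 

\[ \frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2}. \]

Now, if $y = f(x + ct)$ let $z = x + ct$ and write 

\[ y = f(z), \qquad z = z(x, t) = x + ct. \]

Note that 

\[ \frac{\partial z}{\partial t} = c \quad\text{and}\quad \frac{\partial z}{\partial x} = 1. \]

The chain rule for differentiation yields: 

\[ \frac{\partial y}{\partial t} = \frac{dy}{dz}\frac{\partial z}{\partial t} = c\,\frac{dy}{dz}, \qquad \frac{\partial y}{\partial x} = \frac{dy}{dz}\frac{\partial z}{\partial x} = \frac{dy}{dz}, \]

so 

\[ \frac{\partial y}{\partial t} = c\,\frac{\partial y}{\partial x}. \]

Therefore, functions of the form $y = f(x + ct)$ are solutions of the wave 
equation. By an analogous argument, so are functions of the form 
$y = f(x - ct)$. Euler finished off by checking explicitly that functions of 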


this form are solutions of the wave equation. 
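
The two first-order factors can be checked separately. The sketch below is an illustration added here, not Euler's own verification. It assumes a sample profile `f` and a value of $c$ chosen only for the example, and tests by central differences that $f(x + ct)$ satisfies $\partial y/\partial t = c\,\partial y/\partial x$ while $f(x - ct)$ satisfies the same equation with the sign of $c$ reversed.

```python
import math

c = 1.5  # illustrative value of the constant c

def f(z):
    """An arbitrary smooth profile, chosen only for illustration."""
    return math.exp(-z * z) + 0.3 * math.sin(2 * z)

def y_plus(x, t):
    return f(x + c * t)

def y_minus(x, t):
    return f(x - c * t)

def d_dt(y, x, t, h=1e-6):
    """Central-difference approximation to the partial derivative in t."""
    return (y(x, t + h) - y(x, t - h)) / (2 * h)

def d_dx(y, x, t, h=1e-6):
    """Central-difference approximation to the partial derivative in x."""
    return (y(x + h, t) - y(x - h, t)) / (2 * h)

x0, t0 = 0.4, 0.2   # sample point
residual_plus = d_dt(y_plus, x0, t0) - c * d_dx(y_plus, x0, t0)
residual_minus = d_dt(y_minus, x0, t0) + c * d_dx(y_minus, x0, t0)
print(residual_plus, residual_minus)
```

Each residual should vanish up to rounding, one for each factor of the factorised wave equation.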


3.5 Formal Complex Methods 


As we have seen, there was a close connection between certain partial 
differential equations and certain exact differentials that 
mathematicians such as d’Alembert and Euler were learning to exploit. 
A major example of this method was published by d’Alembert in 1752 
in another context. He began with two differentials similar to the ones 
above, imposed an exactness condition, and was led to a different 
constraint on a collection of functions. Although the only change in the 
differentials was one change of sign, the consequence was the arrival of 
what may be called formal complex methods in the theory of 
differential equations. 

D’Alembert entered the manuscript of his Essai d’une nouvelle 
théorie de la résistance des fluides for a prize competition of the Berlin 
Academy, and when no prize was awarded he blamed Euler and their 
already poor relations worsened.¹² D’Alembert reworked the 
manuscript and published it as a book [54], which is where the 
differential equations of hydrodynamics were first written down. 

In Chap. IV, Sect. 45 d’Alembert considered a two-dimensional flow 
in the (x, z)-plane. The general problem was very difficult, and so he 
turned (in Sects. 57-60) to address the simpler task of finding the 
conditions on functions M and N of x and z such that the differentials 

$$M\,dx + N\,dz, \qquad N\,dx - M\,dz \tag{3.3}$$


are exact. This resembles his account of the wave equation, but with a 
change of sign: M = a, N = v and with ay + bw. As we shall see, this 


analogy was to be much appreciated by Euler. 
D’Alembert then argued that if the differentials are exact then there are functions p and q such that

$$M\,dx + N\,dz = dq, \qquad N\,dx - M\,dz = dp.$$

Therefore, the differentials

$$(M + iN)(dx - i\,dz) \quad\text{and}\quad (M - iN)(dx + i\,dz)$$

are exact, and, on putting $du = dx - i\,dz$ and $dt = dx + i\,dz$, $M + iN = \sigma$ and $M - iN = \tau$, the expressions $\sigma\,du$ and $\tau\,dt$ become exact differentials. Therefore, $M + iN = \sigma$ must be a function of $u = x - iz$, and $M - iN = \tau$ must be a function of $t = x + iz$. This allowed him to “deduce the values of M and N”, as he put it: they are

$$M = \frac{1}{2}(\sigma + \tau) \quad\text{and}\quad N = \frac{1}{2i}(\sigma - \tau).$$

D’Alembert then gave what he called a simpler argument to the same effect. The definitions of p and q imply that

$$p_z = -q_x \quad\text{and}\quad q_z = p_x,$$

and he deduced immediately that $q\,dx + p\,dz$ and $p\,dx - q\,dz$ are exact differentials and that therefore $q + ip$ is a function of $x - iz$ and $q - ip$ is a function of $x + iz$. (This is correct because $(q + ip)(dx - i\,dz) = q\,dx + p\,dz + i(p\,dx - q\,dz)$ is exact.) So, he set $q + ip = F(x - iz)$ and $q - ip = G(x + iz)$ and separated out the corresponding expressions for p and q.

Now he had to show that this line of argument led to real-valued functions because his problem was connected to real-valued functions in the plane. So he said that for p and q to be real q must be a function of the form

$$\xi(x - iz) + i\zeta(x + iz) + \xi(x + iz) - i\zeta(x - iz),$$

where the functions $\xi$ and $\zeta$ (regarded as power series, something d’Alembert always thought possible) have real coefficients (he omitted the similar result for p).

So d’Alembert obtained expressions for p and q in which, in his phrase, the imaginary quantities destroy themselves. Later eyes can see that d’Alembert had sketched a quick argument from the Cauchy-Riemann equations to the existence of formal complex functions, but d’Alembert did not; indeed, he made no further use of those equations in the rest of the memoir. Instead, he returned to his original problem and solved it by power series methods.

However, d’Alembert’s work was very influential; there are many later references to “the method of d'Alembert” in the study of surfaces. Formal complex methods rely on a free transition from real to complex variables and functions, which are then handled by algebra and differentiation, but with no appreciation of what it is for a function to be complex differentiable. For example, $\xi(x - iz)$ and $\xi(x + iz)$ above are complex conjugates, as are $i\zeta(x + iz)$ and $-i\zeta(x - iz)$, and so q is the real part of the complex-valued function $2\left(\xi(x + iz) + i\zeta(x + iz)\right)$, and q is a harmonic function of x and z, but none of this was remarked upon by d’Alembert. The method provided a convenient notation, at the cost of requiring that any imaginary quantities could be made to vanish in the end, and leave only equations between purely real quantities.
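The way the imaginary quantities cancel, and the fact that q turns out to be harmonic, can be illustrated numerically. The following is a minimal sketch rather than d’Alembert's own procedure: the cubic `F` stands in for a power series with real coefficients, and it, the sample point, and every helper name are assumptions made for the example.

```python
def F(u):
    """A hypothetical power series with real coefficients, truncated to a cubic."""
    return u**3 + 2 * u


def q(x, z):
    """Real part of F(x - iz)."""
    return F(complex(x, -z)).real


def p(x, z):
    """Imaginary part of F(x - iz)."""
    return F(complex(x, -z)).imag


def d_dx(g, x, z, h=1e-5):
    """Central-difference estimate of the partial derivative of g in x."""
    return (g(x + h, z) - g(x - h, z)) / (2 * h)


def d_dz(g, x, z, h=1e-5):
    """Central-difference estimate of the partial derivative of g in z."""
    return (g(x, z + h) - g(x, z - h)) / (2 * h)


def laplacian(g, x, z, h=1e-3):
    """Five-point estimate of g_xx + g_zz."""
    return (g(x + h, z) + g(x - h, z) + g(x, z + h) + g(x, z - h)
            - 4 * g(x, z)) / (h * h)


def conjugate_sum(x, z):
    """F(x - iz) + F(x + iz); the imaginary parts should cancel."""
    return F(complex(x, -z)) + F(complex(x, z))
```

At any sample point the two relations between the partial derivatives of p and q hold to within rounding, the Laplacian of q is numerically zero, and `conjugate_sum` is real and equal to twice q.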


3.6 Exercises 


1. Find the equation of a plucked string released from an initial position formed by a straight line joining the point (0, 0) to (3π/4, 1) and a straight line joining (3π/4, 1) to (π, 0).

2. Either write a programme to show the motion of the string or find one on the web. How would you describe the motion of the string? How does it produce a sound?
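For the programming exercise, one possible starting point is d’Alembert's form of the solution for a string released from rest, y(x, t) = ½[F(x + ct) + F(x − ct)], where F is the odd, periodic extension of the initial shape. The sketch below assumes a string fixed at x = 0 and x = π, a pluck with its peak at (3π/4, 1), and wave speed c = 1; these values and all function names are assumptions for illustration only.

```python
import math

L = math.pi              # string length; ends assumed fixed at x = 0 and x = L
APEX_X = 3 * math.pi / 4  # assumed position of the pluck
APEX_Y = 1.0              # assumed height of the pluck
C = 1.0                   # assumed wave speed


def initial_shape(x):
    """Two straight segments meeting at (APEX_X, APEX_Y), for 0 <= x <= L."""
    if x <= APEX_X:
        return APEX_Y * x / APEX_X
    return APEX_Y * (L - x) / (L - APEX_X)


def odd_periodic(x):
    """Odd extension of initial_shape with period 2L."""
    x = math.fmod(x, 2 * L)   # result lies strictly between -2L and 2L
    if x < 0:
        x += 2 * L            # shift into the range [0, 2L]
    if x <= L:
        return initial_shape(x)
    return -initial_shape(2 * L - x)


def string_height(x, t):
    """d'Alembert's solution for a string released from rest."""
    return 0.5 * (odd_periodic(x + C * t) + odd_periodic(x - C * t))


def snapshot(t, points=9):
    """Heights at equally spaced points along the string at time t."""
    return [string_height(L * i / (points - 1), t) for i in range(points)]
```

Printing `snapshot(t)` for a sequence of times between 0 and 2π traces one full period of the motion, since under these assumptions the solution repeats after a time 2L/c.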


Questions 


1. In the absence of any clear distinction at the time between continuous, differentiable, and analytic, what could Euler have meant by a curve drawn by a free motion of the hand?

2. What sorts of solutions might Euler admit to the partial differential equation $\frac{\partial z}{\partial x} = 0$?

3. It was quickly appreciated that if a single term of the form $\cos kct \, \sin kx$ is a solution of the wave equation then so is a sum of terms of this form, and even an infinite sum of terms of this form. This resulted in a number of ‘Fourier series’ being produced before Fourier, and speculation about whether every function can be written in this form. Daniel Bernoulli suggested that might be true, Euler disagreed, and Lagrange got close to providing a plausibility argument for the claim before retreating. Information about this debate can be found in Bottazzini [21] and in the commentaries on Euler’s works. Given the range of functions known to them, what do you think an eighteenth-century mathematician might say was involved in deciding this claim?


Footnotes 


1 This account overlaps with the accounts in Barrow-Green, Gray and Wilson [4] and Gray and 
Micallef forthcoming. 


2 I offer a blanket warning that the dates of publications in the eighteenth century can be confusing. It was usual for a member of an Academy to present a paper, which would then be published in a journal of the Academy, but the process was slow, and it is common for a paper to be published two or three years after it was presented. Generally, I have referred to papers by their publication dates.


3 Only in the nineteenth century did mathematicians rephrase this as a necessary and 
sufficient condition. 


4 See Engelsman [68] for the details. The quote from Nicolaus I Bernoulli occurs on p. 106. 


5 See Clairaut [44], 45. Clairaut developed his ideas in competition with Alexis Fontaine. For a 
discussion of their rivalry, and the full context, which includes investigations into the shape of 
the Earth see Greenberg [129]. 


6 See the discussion in Sect. A.2. 


7 See Alembert [52]. 


8 Quoted in Struik Source Book, 353. 


9 Struik notes (Source Book 354, n. 5) that the reference is to Principia Book I, Sec. X, Prop. LI,
where Newton reworked Huygens’s discussion of the pendulum, and that Taylor had done the 
same. 


10 Earlier, d'Alembert had reformulated Daniel Bernoulli’s study of the small oscillations of a 
hanging chain as a partial differential equation, but he had not been able to solve it. See 
D’Alembert [50], 171. 


11 Quoted in Struik, Source Book, p. 355. 


12 They did not improve until 1759, when d’Alembert declined Frederick the Great’s invitation
to become President of the Berlin Academy and recommended Euler instead; neither was 
appointed and Euler soon left for St. Petersburg.


4.1 Introduction


The study of many natural phenomena was opened up by the use of 
partial differential calculus. In particular, in the late 1750s, Euler was 
able to extend his methods to produce equations for the motion of an 
ideal fluid. He also studied the partial differential equations that 
describe the propagation of sound. This led him to a partial differential 
equation of lasting importance for the theory, and in the course of 
attempting to solve it, he also came up with an ordinary differential 
equation that was to be the most important and thoroughly analysed 
equation of its type in the nineteenth century (as we shall see in 
Chaps. 11 and 16). He also gave the first general formulation of rigid 
body mechanics, which rewrote and extended Newton’s ideas and put 
them into something like their modern form. 


4.2 Fluid Mechanics 


The English word ‘hydrodynamics’ is derived from the title of a book Daniel Bernoulli published in 1738, his Hydrodynamica, a word that he coined. His topic was an old one, the flow of water from a vessel through a pipe, and his advance was to determine how the pressure of a fluid on the walls of its container is affected by the velocity of the water. One presumably unfortunate result of his success was that his competitive father Johann Bernoulli then not only wrote and published his Hydraulica in 1742 but sought to pretend that it had been written in 1732, thus claiming priority over Daniel’s work.


Fig. 4.1 Leonhard Euler (1707-1783) by Jakob Emanuel Handmann, 1756 


Much more important investigations were soon made: D'Alembert’s 
Traité de dynamique [50], and in particular his Réflexions sur la cause 
générale des vents [51] which outlined a new theory of the tides, and of 
course his Essai d’une nouvelle théorie de la résistance des fluides, and 
Clairaut’s memoir Théorie de la figure de la terre (Theory of the shape 
of the Earth) [44] which discussed the shape of the Earth, regarded as a 
rotating fluid mass. Even so, it was Euler’s work on fluids that has 
become the basis of all subsequent mathematical discussions of the 
motion of fluids. 


Euler (Fig. 4.1) published three memoirs on the subject in 1757 
(E225, 226, 227) and a further paper (E258) in 1761. They 
demonstrate very clearly both the power of the new methods and the 
difficulties that have to be overcome. He began E225 by noting that 
there was a consensus that the different behaviour of solids and liquids 
must be explained by expressing clearly the essential difference 
between the two. But, he said, this difference had never previously been 
understood. He proposed that it consisted in the fact that a solid can be 
held in equilibrium by two equal and opposite forces, whereas a fluid is 
only in equilibrium if it is held in place by an equal force at every point 
of its surface that acts perpendicular to the surface (he explicitly 
assumed that no forces are acting inside the fluid). 

On this foundation, he derived the equations of motion for a perfect 
fluid, one that is incompressible and inviscid (without viscosity or 
‘stickiness’). Euler was particularly pleased to derive a theory of fluids 
based on the idea that a fluid is composed of infinitesimal solid bodies 
because this extended his version of Newton’s mechanics to fluids. 

In an incompressible fluid Euler’s principle implies that the 
pressure in a body of fluid in equilibrium is known when it is known at 
a single point, and indeed that the pressure at a point depends only on 
the depth of the point.¹

Euler analysed a fluid by introducing mutually perpendicular axes 
OA, OB, OC. This was his standard approach to all questions in 
dynamics. He resolved the force of gravity acting at a point Z in the fluid 
along these three axes as follows: the components of the force in the
directions OA, OB, and OC through Z he called P, Q, and R, respectively. 
He regarded P, Q, and R as functions of x, y, and z. 

He next considered an infinitesimal volume of fluid in the form of a parallelepiped with one vertex at Z and of size dxdydz. He took the density of the fluid to be q, so the forces acting on the parallelepiped were written down as Pq dxdydz in the direction ZL = OA, Qq dxdydz in the direction ZM = OB, and Rq dxdydz in the direction ZN = OC. The pressure of the fluid above the parallelepiped, which Euler supposed to form a column of height p, is the other force involved.


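Euler then wrote

$$dp = L\,dx + M\,dy + N\,dz,$$

where, accordingly,

$$\frac{\partial L}{\partial y} = \frac{\partial M}{\partial x}, \qquad \frac{\partial L}{\partial z} = \frac{\partial N}{\partial x}, \qquad \frac{\partial M}{\partial z} = \frac{\partial N}{\partial y}.$$

He then considered the change in pressure between opposite sides of the parallelepiped and deduced that

$$L = Pq, \qquad M = Qq, \qquad N = Rq.$$

Therefore,

$$dp = q(P\,dx + Q\,dy + R\,dz).$$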


Because the left-hand side can be integrated, Euler deduced from this that the right-hand side can also be integrated, and so (following some earlier remarks by Clairaut)

$$\frac{\partial (Pq)}{\partial y} = \frac{\partial (Qq)}{\partial x}, \qquad \frac{\partial (Pq)}{\partial z} = \frac{\partial (Rq)}{\partial x}, \qquad \frac{\partial (Qq)}{\partial z} = \frac{\partial (Rq)}{\partial y}.$$

Much now depends on whether the fluid has a constant density or if the
density varies with the depth. The analysis became complicated and so 
Euler turned to the study of particular cases, such as the theory of the 
barometer. Here, he observed that a wind must arise whenever the heat 
at equal heights is different and that a study of the equilibrium figures 
of a fluid suggests that some of them might approximate the shape of a 
planet. 

Euler returned to his general analysis in his next paper, E226. Here, 
he considered the motion of an infinitesimal cube in the fluid. It will be 
helpful to sketch his argument initially without the mathematical 
details. 

In an infinitesimal moment of time, the infinitesimal cube is 
stretched or squashed in the directions of the axes, although its volume 
cannot change because the fluid is assumed to be incompressible. 

Euler considered that what drives the motion is the difference in 
pressure between each pair of faces of the infinitesimal cube. Consider, 


for example, what is involved in saying that an infinitesimal cube does 
not sink under gravity. The pressure on the bottom face differs from the 
pressure on the top face by an amount equal to the weight of the liquid 
in the cube, which is equal to the volume of the cube times the density 
of the liquid. If these quantities were not equal, the pressure difference 
would cause the cube to move. This pressure difference manifests itself 
as a force, and this will bring about a change in the velocities of each 
particle of the fluid. 

If we put all this together, we expect to find equations, one in each of 
the x-, y-, and z-directions, that say that a pressure difference on a small 
cube of fluid is equal to an acceleration in that direction multiplied by 
the mass of the cube. We indeed expect to see that the acceleration in 
the fluid will be described as the acceleration at each point and some 
measure of the stretching and squashing of the cube. This is exactly 
what Euler found. 

Euler supposed that at a point in the fluid with coordinates (x, y, z) the velocity was (u, v, w), where each of u, v, and w is a function of x, y, z and the time t. At a nearby point with coordinates (x + dx, y + dy, z + dz) the velocity is given by

$$u + \frac{\partial u}{\partial x}dx + \frac{\partial u}{\partial y}dy + \frac{\partial u}{\partial z}dz$$

in the x-direction, and by similar expressions for the velocities in the y- and z-directions.

As for the pressure differences, if p is the pressure, the difference in pressure across the faces separated by a distance dx is $\frac{\partial p}{\partial x}dx$.


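The differences in velocities are of two kinds. The point originally at (x, y, z) has moved to (x + u dt, y + v dt, z + w dt), and each of u, v, and w has changed. For example, by standard Taylor series arguments, the change in u is from u to

$$u + \frac{\partial u}{\partial t}dt + \frac{\partial u}{\partial x}\frac{dx}{dt}dt + \frac{\partial u}{\partial y}\frac{dy}{dt}dt + \frac{\partial u}{\partial z}\frac{dz}{dt}dt,$$

with similar expressions for v and w. Note that dx/dt = u, etc., so the increase in u in a time dt is

$$\frac{\partial u}{\partial t}dt + \frac{\partial u}{\partial x}u\,dt + \frac{\partial u}{\partial y}v\,dt + \frac{\partial u}{\partial z}w\,dt.$$
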
To obtain the equations of motion, Euler interpreted the rule that force equals mass times acceleration as saying that acceleration equals force divided by mass, and took the mass of an infinitesimal cuboid to be its volume times its density. He assumed that the density, q, was constant, which is a good approximation for water in many situations.

When all this is put together, the result (see E226, p. 286) is these three equations:

$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} + w\frac{\partial u}{\partial z} = -\frac{1}{q}\frac{\partial p}{\partial x} \tag{4.1}$$

$$\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} + w\frac{\partial v}{\partial z} = -\frac{1}{q}\frac{\partial p}{\partial y} \tag{4.2}$$

together with the corresponding equation (4.3) for w in the z-direction.

These are the differential equations for the motion of the fluid: the minus signs arise because we (with Euler) are measuring gravity in the opposite direction to the corresponding axis and because pressure increases as depth increases.

There is also an equation stating that the volume of any part of the 
fluid does not change, which had been known a decade before to 
d’Alembert, in a paper Euler had read. Euler explained that better, and 
expressed the conclusion more elegantly, in his ‘Principia motus fluidorum’ (Principles of the motion of fluids, E258), which he published in 1761. He now considered the motion of a particle of the fluid infinitesimally close to (x, y, z), and considered what would happen to an infinitesimally small pyramid of the fluid in an interval of time dt. It is moved to another infinitesimally small pyramid of the same volume, and by calculating these volumes and equating them Euler deduced that in the flow at any instant

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0.$$

This is known as the continuity equation for the flow of an 
incompressible liquid in space, and it expresses the idea that as the 
fluid flows it does not change its volume. 

Euler returned to the subject in a paper (E258) he published in 1761 as part of a long investigation into the motion of fluids, and in this paper, the Laplace equation was written down for the first time. Here, he again considered incompressible fluids of constant density in two or three dimensions. In two dimensions, he wrote the components of velocity of the fluid at a point as (u, v), and by calculating the volumes of infinitesimal elements he deduced that for an incompressible fluid $u_x + v_y = 0$. He then claimed that $u\,dx + v\,dy$ is exact (which it is only if the flow is irrotational, in later terminology) and defined S as its integral, writing

$$dS = u\,dx + v\,dy + U\,dt.$$

From the continuity condition, he deduced that $u\,dy - v\,dx$ is another complete differential.*

He then repeated this argument in three dimensions, found that $u\,dx + v\,dy + w\,dz$ is exact, and introduced?

$$dS = u\,dx + v\,dy + w\,dz + U\,dt.$$

In paragraph 67 he deduced that

$$S_{xx} + S_{yy} + S_{zz} = 0.$$


He then wrote “Since it is not obvious how in general this can be made to happen, I shall consider certain classes of possibilities”, and found, as his first example, that $(ax + by + cz)^n$ will work for any n, provided $a^2 + b^2 + c^2 = 0$. Linear combinations of these will also work, and he wrote down all expressions of degree less than 6. Then he went back to his real business, the motion of fluids.
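Euler's first example can be confirmed by a direct calculation, set out here in modern notation as a worked check rather than as a reproduction of Euler's own working. The particular values a = 3, b = 4, c = 5i are an illustrative choice and are not taken from the source.

```latex
\begin{aligned}
S &= (ax + by + cz)^n, \\
S_{xx} &= n(n-1)\,a^2\,(ax + by + cz)^{n-2}
  \quad\text{(and similarly for } S_{yy},\ S_{zz}\text{)}, \\
S_{xx} + S_{yy} + S_{zz} &= n(n-1)\,(a^2 + b^2 + c^2)\,(ax + by + cz)^{n-2}, \\
a = 3,\ b = 4,\ c = 5i &\implies a^2 + b^2 + c^2 = 9 + 16 - 25 = 0 .
\end{aligned}
```

Since $a^2 + b^2 + c^2 = 0$ has no real solution other than a = b = c = 0, at least one coefficient must be complex, and real solutions are then obtained by taking real and imaginary parts.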

It does not seem that he went back to the two-dimensional case and 
deduced that powers of x + iy are harmonic, or that he made use of 


this obvious deduction elsewhere in his work. Most likely it became 
part of the general education of mathematicians of the next generation 
without its significance being appreciated. 

However, it is one thing to have some equations of motion, and 
another to solve them. A few minutes looking at water in a bath or at 
the wind makes it clear that even the simple fluids that Euler described 
can display very complicated motions, and in general Euler was able to 
deal only with special cases, although he did become the first person to 
describe motion in vortices in mathematical terms. Perhaps the best 
way that we can indicate the difficulties in this branch of mathematics 
is to point out that Euler’s equations of motion for a perfect fluid are 
still far from being adequately understood, and the problems raised by 
the equations for a general fluid (the so-called Navier-Stokes 
equations) are among the “millennium problems” whose solutions 
could earn a mathematician a million-dollar prize from the Clay 
Mathematics Institute.° 

The fact is that Euler was exceedingly prescient when he wrote, at 
the start of E226, that’: 


Having established in my previous Memoir the principles of fluid 
equilibrium in their most general form, regarding both the 
diverse nature of fluids and the forces that act upon them, I now 
propose to deal with the motion of fluids in the same way and to 
seek out the general principles on which the entire science of 
fluid motion is based. It will readily be understood that this is a 
much more difficult undertaking and involves studies of 
incomparably greater depth. Nevertheless, I hope to arrive at an 
equally successful conclusion, so that, if difficulties remain, they 
will pertain not to Mechanics but purely to Analysis, this science 
not yet having been brought to the degree of perfection 


necessary to develop analytical equations that embody the 
principles of fluid motion. 


4.2.1 Recent Discoveries About the Euler Equations 


I cannot resist quoting from an astonishing passage in Cédric Villani’s 
[261] Birth of a Theorem (2016, 91-93): 


Imagine you're walking through the woods on a peaceful 
summer’s afternoon. You pause at the edge of a pond. Everything 
is perfectly calm, not the slightest breeze. 

Suddenly the surface of the pond becomes agitated, as 
though seized by convulsions; a few moments later, it is sucked 
down into a roaring whirlpool. And then, a few moments after 
that, everything is calm once more. Still not a breath of air, not 
even a ripple on the surface from a fish swimming beneath it. So 
what happened? 

The Scheffer-Shnirelman paradox, surely the most 
astonishing result in all of fluid mechanics, proved that such a 
monstrosity is possible, at least in the mathematical world. 

[...] It rests on the incompressible Euler equations, the oldest 
of all partial differential equations, used by mathematicians and 
physicists everywhere to describe a perfectly incompressible 
fluid without any internal friction. It has been more than two 
hundred fifty years since Euler derived his fundamental 
equations, and yet not all of their mysteries have been 
penetrated. Indeed, they are still considered to mark out one of 
the most treacherous regions of the mathematical world. When 
the Clay Mathematics Institute set seven ‘millennium problems’ 
in 2000, offering $1 million apiece for their solution, it did not 
hesitate to include the regularity of solutions to the Navier- 
Stokes equations. It was very careful, however, to avoid any 
mention of Euler’s equations - a far greater and more terrifying 
beast. 

And yet at first glance Euler’s equations seem so simple, so 
innocent, utterly devoid of guile or cunning. No need to model 
variations in density or to grapple with the enigmas of viscosity. 


One has only to write down the classical laws of conservation: 
conservation of mass, quantity of motion, and energy. 

But then... suddenly, in 1993, Scheffer showed that Euler’s 
equations in the plane are consistent with the spontaneous 
creation of energy! Thanks to [several subsequent authors] we 
now realise that even less is known about Euler’s equations than 
we thought. 

And what we thought we knew wasn't much to begin with. 


4.3 Euler and the Propagation of Sound 


In the mid-eighteenth century, the nature and the propagation of sound 
were poorly understood. Newton had written about it in the Principia, 
and Euler in a study of heat, but d’Alembert had dismissed both in his 
Traité des fluides Sect. 219, writing that 


The formula given without proof by Euler is very different from 
Newton’s, and I do not know how he was led to it; as for 
Newton’s formula, it is proved in the Principia but in perhaps the 
most obscure and difficult part of that work. 


So it was ambitious of the young Lagrange to write as one of his first 
works a 112-page memoir on the subject ([169], from which the above 
quote was taken).° His work in turn provoked Euler to return to the 
subject, and in his [92], he first observed that Newton’s account was 
ingenious reasoning based on purely arbitrary hypotheses. Then he 
wrote that? 


All those who have dealt with this matter after Newton either 
have fallen into the same trap, or, wanting to delve into the true 
movement of the air, have rushed into intractable calculations, 
from which one could absolutely not draw any conclusions, and I 
must admit that I arrived at one or the other place whenever I
undertook this research. I was therefore pleasantly surprised 
when I saw in this excellent book that I have just mentioned, 
that Mr. De La Grange has happily overcome all these difficulties, 
and that by calculations which could seem quite unintelligible. 


This is unquestionably one of the most important discoveries we 
have made for a long time in Mathematics, and one which may 
lead us to many others. 

In examining these prodigious calculations, I wondered at 
first if it would not be possible to achieve the same goal by an 
easier route, and after some effort I got there. I have therefore 
the honor to explain here the method that seems the most 
suitable for this study, but, as simple as it may appear, I must 
insist that it would not have occurred to me, if I had not seen the 
ingenious analysis of M. De La Grange. 


None of which meant that the derivation was simple (and because it is difficult it is omitted here). The upshot was a system of three second-order partial differential equations, not unlike the equation for the vibrating string, that described the infinitesimally small motions of the air involved in the passage of a sound wave. This is a longitudinal wave: the particles of the air move in the direction of the sound (unlike the vibrating string, which oscillates transversely). Euler obtained these equations in his ([93], Sect. 43). If the air is homogeneous and the
sound travels radially at the same speed in all directions, then with 
respect to coordinates centred at the source of the sound the 
displacement (x, y, z) of a particle at (X, Y, Z) is given by

$$x = Xs, \qquad y = Ys, \qquad z = Zs,$$

where s is a function of the time t and the radial distance $V = \sqrt{X^2 + Y^2 + Z^2}$. In this case, Euler’s equations reduce to ([93], Sect. 45)

where 2g is the acceleration due to gravity in Euler’s units and h is a measure of the elasticity of the air.


Mathematically, the constant term 2gh can be absorbed by changing 
the time variable to 2ght at the cost of changing a to 4 . eh. This 


produces an equation of the form 
Or OAV? © Ot 
A further change of variable will write this as
= 0,
which is propitious. 
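This is the form in which Euler studied it in his Institutionum Calculi Integralis vol. 3, Part 2, Sect. 4 (E385, 1770, Sect. 322).¹⁰ Euler wrote it as

$$(x + y)^2\frac{\partial^2 z}{\partial x\,\partial y} + m(x + y)\frac{\partial z}{\partial x} + m(x + y)\frac{\partial z}{\partial y} + nz = 0. \tag{4.5}$$

He regarded x + y as a new variable, and looked for a solution in the form $z = (x + y)^{\lambda}F(x)$. On this assumption, the method of undetermined coefficients allowed Euler to calculate F(x) as a power series, and he deduced a recurrence relation:

$$\begin{aligned}
n + 2m\lambda + \lambda^2 - \lambda &= 0 \\
(n + 2m\lambda + 2m + \lambda^2 + \lambda)B + (m + \lambda)A &= 0 \\
(n + 2m\lambda + 4m + \lambda^2 + 3\lambda + 2)C + (m + \lambda + 1)B &= 0 \\
(n + 2m\lambda + 6m + \lambda^2 + 5\lambda + 6)D + (m + \lambda + 2)C &= 0
\end{aligned}$$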


from which the coefficients can be deduced in turn. If we write $A = b_0$, $B = b_1$, $C = b_2, \ldots$, Euler’s conclusion was that

But Euler also noticed that there are values where the series breaks off. These occur when j is an integer and $m + \lambda + j = 0$, which can occur when $\frac{1}{4} - m - n + m^2$ is a square.
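The recurrence, and the way the series can break off, can be explored with exact rational arithmetic. The sketch below assumes that the equation under study is (x + y)² z_xy + m(x + y)(z_x + z_y) + nz = 0 and that the trial solution is a sum of terms c_k (x + y)^(λ+k) f^(k)(x) with c_0 = 1, which is the reading under which the relations above hold. The values m = −2, n = 4, λ = 1, the choice f = sin, and every function name are illustrative assumptions.

```python
import math
from fractions import Fraction


def indicial(m, n, lam):
    """Left-hand side of the first relation: n + 2*m*lam + lam**2 - lam."""
    return n + 2 * m * lam + lam * lam - lam


def series_coefficients(m, n, lam, max_terms=10):
    """Coefficients c_0, c_1, ... with c_0 = 1, stopping if the series breaks off.

    A zero denominator (lam + k equal to the other root of the first
    relation) is not handled and would raise ZeroDivisionError.
    """
    coeffs = [Fraction(1)]
    for k in range(1, max_terms):
        numer = -(m + lam + k - 1) * coeffs[-1]
        if numer == 0:
            break  # the factor m + lam + (k - 1) vanishes: the series terminates
        denom = n + 2 * m * (lam + k) + (lam + k) * (lam + k - 1)
        coeffs.append(numer / denom)
    return coeffs


def sin_derivative(k, x):
    """k-th derivative of sin at x."""
    table = (math.sin, math.cos,
             lambda s: -math.sin(s), lambda s: -math.cos(s))
    return table[k % 4](x)


def trial_solution(coeffs, lam, x, y):
    """Sum of c_k * (x + y)**(lam + k) * f^(k)(x) with f = sin."""
    t = x + y
    return sum(float(c) * t ** (lam + k) * sin_derivative(k, x)
               for k, c in enumerate(coeffs))


def pde_residual(z, m, n, x, y, h=1e-3):
    """Finite-difference residual of (x+y)^2 z_xy + m(x+y)(z_x + z_y) + n z."""
    z_x = (z(x + h, y) - z(x - h, y)) / (2 * h)
    z_y = (z(x, y + h) - z(x, y - h)) / (2 * h)
    z_xy = (z(x + h, y + h) - z(x + h, y - h)
            - z(x - h, y + h) + z(x - h, y - h)) / (4 * h * h)
    t = x + y
    return t * t * z_xy + m * t * (z_x + z_y) + n * z(x, y)
```

For m = −2, n = 4, λ = 1 the first relation is satisfied, the quantity ¼ − m − n + m² equals 9/4 = (3/2)², and the series stops after two terms, giving z = (x + y) f(x) − ½ (x + y)² f′(x).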


The same can be done with y instead of x, and so Euler declared that the general solution was of the form

$$\sum_j b_j (x + y)^{\lambda + j}\left(f^{(j)}(x) + g^{(j)}(y)\right),$$

where, for example, $f^{(j)}$ denotes the jth derivative of f, which Euler was confident was a complete solution because it contained two arbitrary functions, f(x) and g(y).

Darboux’s clear analysis of Euler’s equation (4.4) and its solutions 
will be found at the end of this chapter. 


4.4 Euler’s Vision of Mechanics 


In his two-volume Mechanica [70] Euler discussed the motion of point 
masses under forces and in resisting media (vol. 1), and their motion on 
surfaces (vol. 2). Significantly, the book is written throughout in the 
language of the (Leibnizian) calculus. He also sketched a plan for 
describing the motion of solid rigid masses, elastic bodies, fluids, and 
gases. Much of this was unknown territory at the time and over the
decades Euler’s contributions to various parts of this programme 
greatly enlarged the reach of mathematics. 

Let us observe in passing that in volume 2 of the Mechanica (p. 464, 
Sect. 832) Euler mentioned what may be one of the first partial differential equations¹¹:


Finally I have turned or rounded surfaces [of revolution], which are generated by the rotation of any curve about an axis; if AX were such an axis, on putting x constant, the equation between y and z gives a circle with centre P. Whereby the equation for these has this form:

$$dz = P\,dx - \frac{y}{z}\,dy \quad\text{or}\quad z\,dz + y\,dy = zP\,dx,$$

where Pz only depends on x; or

with X present as a function of x.


The partial differential equation appears here as a differential of the 
form Pdx = Rdz + Qdy that is to be integrated. 


Euler dealt with rigid bodies in his ‘Découverte d’un nouveau 
principe de Mécanique’ ([79], E177). In it, he gave a decisive 
reformulation of the theory of mechanics that brought it into line with 
the practice of the calculus as he understood it. Truesdell, in his An 
idiot’s fugitive essays in science ([259], 317) called the paper “a great
masterpiece”, and correctly observed that “it has dominated the 
mechanics of extended bodies ever since”. He went on 


This paper contains the first proposal of the so-called Newton's 
equations, f = ma in rectangular Cartesian coordinates, as a 


“new principle of mechanics”, the common origin of all the 
several other principles then in use. 


Euler’s plan for the paper began with a definition of a solid body as one 
whose parts do not move with respect to each other (unlike, say, a 
liquid). He then said that existing principles of mechanics would show 
that at any instant the motion of a solid body can be analysed in terms 
of the motion of its centre of gravity and the rotation of the solid 
around an axis through the centre of gravity. This would be done by 
showing how the forces on the body determine how the centre of 


gravity will move. However, new principles would be needed to 
understand the rotation, which is about a varying axis. A start could be 
made by analysing rotations about a fixed axis, but it would be 
necessary to consider axes of rotation that do not pass through the 
centre of gravity of the body. 

The new principle, upon which he proposed to base all of 
mechanics, should, he said, be derived 


from first principles, or rather axioms, on which all the theory of 
motion is based. The axioms relate to infinitely small bodies that 
can only have a progressive motion; and all other principles of 
motion must be deduced from these, those which serve to 
determine the motion of solids as well as of fluids; all other 
principles will be nothing but the application of these axioms in 
various ways.¹²


As he remarked, there were several such principles in use, and he 
proposed to derive them all from the new principle that he now put 
forward. 

He began by considering an infinitely small body of mass M acted 
upon by some forces. Its motion can be described with respect to a 
fixed but arbitrary plane by considering the height, x, of the point mass
above this plane. The forces acting on the point mass in various 
directions can be expressed in terms of forces parallel to the plane and 
forces perpendicular to it; Euler let P be the force perpendicular to the 
plane. After a time dt, the point mass will be a distance x + dx from the 


plane, 


and taking the element of time dt as constant, it will be the case that $2M\,ddx = \pm P\,dt^2$, according as the force P tends to move the body away from or towards the plane. It is this single formula that contains all the principles of mechanics.¹³


Euler employed a system of units in which the quantity M is measured 
in units such that the point mass has a weight of M near the surface of 
the Earth, and accordingly the force P is then the weight of the body. If 


the body moves away from the plane with a speed dx/dt, and if this is the speed that it would acquire by falling through a height of h, then one has

$$\left(\frac{dx}{dt}\right)^2 = h, \quad\text{and so}\quad dt = \frac{dx}{\sqrt{h}}.$$


Euler next supposed that the motion of the point mass was measured 
with respect to three mutually perpendicular planes, and supposed that 
the forces acting perpendicularly to these planes were P, Q, and R, 
respectively. He then wrote down these equations of motion: 


\[ 2M\,ddx = P\,dt^2; \quad 2M\,ddy = Q\,dt^2; \quad 2M\,ddz = R\,dt^2. \]


This is the first time Newton’s equations of motion were expressed in 
the formalism of the calculus. Moreover, they have been expressed with 
respect to three mutually perpendicular but otherwise arbitrary axes— 
taking perpendicular axes as standard came in with Euler, not, for 
example, Descartes. There is another important difference between 
Euler’s formulation and Newton's: Newton spoke of bodies, Euler of 
infinitesimal elements out of which bodies are formed. 

As Euler then noted, if no forces act then P = 0, Q = 0, R = 0 and
so the above differential equations can be integrated and the point
mass is shown to move in a straight line. This establishes that a body
initially at rest remains at rest, and one initially in motion remains
moving uniformly in the same direction unless it is acted upon by a
force.¹⁴

Next, Euler analysed the motion of a body whose centre of gravity is 
fixed, and by a more complicated argument of the same kind as before 
he deduced that at any instant the body is rotating about an axis 
through the centre of gravity. 

He then set about describing the motion in general. He supposed 
that there are three mutually perpendicular axes in the body (OA, OB, 
and OC) that meet at the centre of gravity O. (Unlike the earlier choice 
of coordinates, which were taken with respect to three fixed planes in 
space, these axes are fixed in the body and so are moving.) To deal with 
the fact that the axis about which the body is rotating itself changes
with time, Euler showed that it is enough to know how the three axes
OA, OB, and OC change with time, which depends on the shape of the
body and the distribution of mass within it. However, although
Euler was able to obtain the differential equations of motion, even he
found them “too long”, and he concluded this paper with a discussion of
some special cases.

More than 10 years later, however, Euler returned to this question 
and showed in his book the Theoria motus corporum solidorum seu 
rigidorum (Theory of the motion of solid or rigid bodies) (E289) of 
1765 that every rigid body has a set of axes with respect to which its 
behaviour is particularly simple.¹⁵

First, he gave a new account of how a body rotates about a fixed but 
arbitrary axis through its centre of gravity in terms of what are called 
today the ‘Euler angles’ of a rotation. Then he introduced the concept of 
the principal axes of rotation of the body. This is the crucial 
breakthrough that extended Newtonian mechanics from the study of 
point masses to arbitrary bodies—everything from the bones in our 
bodies to car chassis and orbiting satellites.

Euler began by explaining how to describe and quantify the motion
of a rotating body. This led him to define


Sect. 422. The moment of inertia of a body with respect to some 
axis is the sum of all the products which arise, if the individual 
elements of the body are multiplied by the square of their 
distances from the axis. 


The moment of inertia is an integral of terms dM and $r^2$ (the element
of mass and the square of its distance from the axis) that are always
positive, and so it is necessarily positive. Moreover, it can be calculated
with respect to any axis, not merely an axis about which one might
suppose that the body is ‘really’ rotating.
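The definition in Sect. 422 can be illustrated numerically. The following is a minimal Python sketch, assuming the body is idealised as a finite set of point masses, so that Euler's integral becomes a finite sum; the function name `moment_of_inertia` and the sample masses are our own illustrative choices, not Euler's.

```python
import math

def moment_of_inertia(masses, points, axis_point, axis_dir):
    """Sum of m * r**2, where r is the distance of each point mass from the axis.

    The axis passes through axis_point in the direction axis_dir (3-tuples).
    """
    norm = math.sqrt(sum(c * c for c in axis_dir))
    unit = [c / norm for c in axis_dir]
    total = 0.0
    for m, p in zip(masses, points):
        rel = [p[i] - axis_point[i] for i in range(3)]
        along = sum(rel[i] * unit[i] for i in range(3))
        # squared distance from the axis = |rel|^2 - (component along the axis)^2
        r2 = sum(c * c for c in rel) - along * along
        total += m * r2
    return total

# Hypothetical body: two unit masses at distance 1 on either side of the z-axis.
masses = [1.0, 1.0]
points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
I_z = moment_of_inertia(masses, points, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
I_x = moment_of_inertia(masses, points, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
print(I_z, I_x)
```

Because the axis is passed in as an argument, the same function can be evaluated for any axis, which is the point made in the paragraph above: the quantity is defined whether or not the body is actually rotating about that axis.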

To calculate the motion of a body from the forces acting upon it,
Euler proposed that one look for the most appropriate set of axes to
use for a given body. First, he said that one should take the axis IG
through the centre of inertia I that yields a minimum for the moment of
inertia among all such axes. Then one should find orthogonal axes
through the centre of inertia I, looking specifically for axes about which
the moment of inertia is a minimum or a maximum. This is a calculus
problem that leads to a cubic equation, which must have either one real
root or three. However, Euler was unable to deduce from the equation
itself that there are always three real roots, and gave only an obscure
argument in support of that claim. Finally,
Euler proclaimed that every rigid body has three axes mutually at right
angles with respect to which the moments of inertia are either a
maximum or a minimum. These he called the principal axes:


446. The principal axes of any body are these three axes passing 
through the centre of inertia of this body, with respect to which 
the moments of inertia are either a maximum or a minimum. 


Euler then showed how to analyse the motion of a rotating solid body in 
terms of its motion with respect to the principal axes, how the action of 
forces affects the motion, and how to solve many problems in the 
dynamics of rigid bodies. 


4.5 Darboux’s Account 


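Euler’s study of Eq. (4.4), the equation for the passage of sound, was
nicely illustrated by Gaston Darboux over a century later in his ([58],
Vol. 2, Chap. 3). He wrote the equation in the form


\[ \frac{\partial^2 z}{\partial x\,\partial y} - \frac{m}{x-y}\frac{\partial z}{\partial x} + \frac{n}{x-y}\frac{\partial z}{\partial y} - \frac{p}{(x-y)^2}\,z = 0 \]

and noted that the substitution $z = (x-y)^{\alpha} z'$ turns it into one of
the same form for $z'$ in which


\[ m' = m + \alpha, \quad n' = n + \alpha, \quad p' = p + \alpha^2 + \alpha(m + n - 1). \]

So it is possible to choose a value of $\alpha$ that makes $p' = 0$ and write the
equation in the form

\[ \frac{\partial^2 z}{\partial x\,\partial y} - \frac{\beta'}{x-y}\frac{\partial z}{\partial x} + \frac{\beta}{x-y}\frac{\partial z}{\partial y} = 0. \tag{4.6} \]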


Routine calculations show that if $Z(\beta, \beta') = Z(\beta, \beta')(x, y)$ is a solution,
then so is

\[ Z(1 - \beta', 1 - \beta)\,(y - x)^{1 - \beta - \beta'}. \]

If we set $t = y/x$ and $z = x^{\lambda}\varphi(t)$ then z is a solution of Eq. (4.6) if and
only if $\varphi(t)$ satisfies the ordinary differential equation


\[ t(1 - t)\varphi''(t) + \bigl(1 - \lambda - \beta - (1 - \lambda + \beta')t\bigr)\varphi'(t) + \lambda\beta'\varphi(t) = 0, \]


which is a differential equation that later became known as the
hypergeometric equation and is arguably the most important ordinary
differential equation in the history of mathematics.¹⁶

The canonical form of the hypergeometric equation is the equation


\[ z(1 - z)\frac{d^2 w}{dz^2} + \bigl(\gamma - (\alpha + \beta + 1)z\bigr)\frac{dw}{dz} - \alpha\beta w = 0. \tag{4.7} \]


One solution of this equation is given by the hypergeometric series


\[ F(\alpha, \beta, \gamma, z) = 1 + \frac{\alpha\beta}{1\cdot\gamma}z + \frac{\alpha(\alpha + 1)\beta(\beta + 1)}{1\cdot 2\cdot\gamma(\gamma + 1)}z^2 + \cdots \tag{4.8} \]

This series converges in the disc |z| < 1. Euler considered only the case
in which the variable is real, and gave two accounts of the equation and
the series respectively, one in four chapters of the Institutionum Calculi
Integralis ([95], Vol. 2, Part I, Chaps. 8-11), and a later one presented to
the St. Petersburg Academy of Science in 1778 and published
posthumously as Euler [98].
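To make the series concrete, here is a minimal Python sketch that sums its first terms using the ratio of consecutive coefficients. The function name `hypergeometric_partial_sum` and the cut-off of 200 terms are our own assumptions for illustration; the two closed forms used as checks, F(1, 1, 2, z) = −ln(1 − z)/z and F(α, β, β, z) = (1 − z)^(−α), are standard identities for |z| < 1.

```python
import math

def hypergeometric_partial_sum(alpha, beta, gamma, z, terms=200):
    """Partial sum of 1 + (alpha*beta)/(1*gamma) z + ... using `terms` terms."""
    total = 0.0
    term = 1.0  # the n = 0 term
    for n in range(terms):
        total += term
        # ratio of consecutive terms: (alpha+n)(beta+n) / ((n+1)(gamma+n)) * z
        term *= (alpha + n) * (beta + n) / ((n + 1) * (gamma + n)) * z
    return total

z = 0.5
# Compare with the closed form -ln(1 - z)/z for F(1, 1, 2, z).
print(hypergeometric_partial_sum(1, 1, 2, z), -math.log(1 - z) / z)
# With alpha = -2 the coefficients vanish after the z^2 term,
# so the series is the polynomial 1 - 6z + 6z^2.
print(hypergeometric_partial_sum(-2, 3, 1, z), 1 - 6 * z + 6 * z * z)
```

The second call shows the series terminating when the first parameter is a negative integer, which is the special case that arises below when a parameter of the solution is a positive integer.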

It is easy, if unilluminating, to solve Euler’s hypergeometric
equation by the method of undetermined coefficients. Let us denote
one solution of it by $F(-\lambda, \beta', 1 - \lambda - \beta, y/x)$ and another by


\[ (y/x)^{\beta + \lambda} F(\beta, \beta' + \beta + \lambda, 1 + \beta + \lambda, y/x); \]


then the corresponding solutions of Eq. (4.6) are


\[ z = x^{\lambda} F(-\lambda, \beta', 1 - \lambda - \beta, y/x), \]

\[ z = x^{-\beta} y^{\beta + \lambda} F(\beta, \beta' + \beta + \lambda, 1 + \beta + \lambda, y/x). \]

As Euler had already noticed, special cases arise when $\lambda$ is a positive
integer.
There are also solutions obtained by the method of separation of
variables, z = X(x)Y(y). This leads to the equation


\[ x + \beta\frac{X}{X'} = y + \beta'\frac{Y}{Y'}, \]


so both sides must be a constant, a say, and so


\[ X(x) = (x - a)^{-\beta}, \quad Y(y) = (y - a)^{-\beta'}, \]

and so

\[ z = (x - a)^{-\beta}(y - a)^{-\beta'}. \]


Finally, as Darboux remarked, it was shown by Paul Appell in 1882 that
if $\varphi(x, y)$ is an arbitrary solution of Eq. (4.6) then the most general
solution is of the form


\[ (cx + d)^{-\beta}(cy + d)^{-\beta'}\,\varphi\!\left(\frac{ax + b}{cx + d}, \frac{ay + b}{cy + d}\right), \]


where a, b, c, d are arbitrary constants and $ad - bc \neq 0$.


4.6 Exercises 


1. The hypergeometric equation is the ordinary differential equation


\[ x(1 - x)\frac{d^2 w}{dx^2} + \bigl(\gamma - (\alpha + \beta + 1)x\bigr)\frac{dw}{dx} - \alpha\beta w = 0. \]


Show that the series


\[ F(\alpha, \beta, \gamma, x) = 1 + \frac{\alpha\beta}{1\cdot\gamma}x + \frac{\alpha(\alpha + 1)\beta(\beta + 1)}{1\cdot 2\cdot\gamma(\gamma + 1)}x^2 + \cdots \]


is a solution of the equation, and find another, linearly independent
solution.


The series reduces to a polynomial if either $\alpha - 1$ or $\beta - 1$ is a
negative integer, and is not defined at all if $\gamma$ is a negative integer
or zero (this case Euler excluded). Show that in all other cases the
series is convergent for x = a + bi provided that $a^2 + b^2 < 1$.


Questions


1. In what ways did Euler’s approach to mechanics improve upon
Newton’s?


2. Find out what you can about the motion of the Moon. Would it be
easy or difficult to use knowledge of the motion of the Moon to
determine longitude at sea? A fascinating take on this is provided
by the story of Harrison’s chronometers; see what you can find out
about them.


Footnotes 


1 In later papers, he noted the changes that have to be made if the fluid is elastic or 
compressible. 


2 Euler wrote $\left(\frac{d}{dx}\right)$ where we have written $\frac{\partial}{\partial x}$; in this period, partial derivatives were often
written as ordinary derivatives enclosed in round brackets.


3 How it acquired Laplace’s name is a long story often told elsewhere. 


4 This is only true if the fluid is, in today’s language, irrotational; Euler seems to have assumed 
this in this early work. 


5 The function S was later called the velocity potential by Helmholtz. 


6 See Carlson et al. [32]. 


7 See Frisch’s translation in the Euler Archive. He notes that Euler wrote “formules” where he 
has supplied “analytical equations”. 


8 This work has an interesting account of the ideas of Euler, D’Alembert, and Daniel 
Bernoulli on the nature of solutions to the equation for the vibrating string. 


9 Translation slightly modified from that of Ian Bruce in the Euler Archive.


10 A similar equation for the propagation of sound was studied by Laplace, with less success,
in a paper he presented to the Académie des Sciences in 1773 but which was published only in
1777.


11 Translation by Ian Bruce in the Euler Archive. 


12 See Euler [79], 194. 


13 See Euler [79], 195. Note that Euler’s conventions about units produce factors of 2 in 
formulas where our conventions do not. 


14 Euler here seems to have shared a naive belief that rest and motion of any kind are 
somehow different, but the separation of rest from uniform motion here may have been a 
pedagogical position. 


15 Our account concentrates on Vol. 1, Chap. 5. There is an English translation of much of the 
book by Ian Bruce in the Euler Archive. 


16 See Chaps. 11 and 16. 
5.1 Introduction


As methods for dealing with two or more independent variables
advanced, it became possible to pose questions about partial differential
equations. First, as with ordinary differential equations,
mathematicians looked for formal methods that would lead to exact,
general solutions, and this led to questions about the existence of
integrating factors. Euler discussed the problem of finding integrating
factors for expressions of the form dx + a(x, y)dy in volume 1 of his
Institutionum Calculi Integralis [94], and considered many types of
cases without being able to show that one always existed.¹ However, as
we shall see, a solution to this problem had been published a little
earlier by d’Alembert. The method of characteristics was introduced for
the first time by Euler and d’Alembert, and extended by Lagrange and
Gaspard Monge, who applied it to first- and second-order partial
differential equations.


5.2 Euler’s General Theory of Partial Differential Equations


Euler began to think of creating a theory of partial differential 
equations in the early 1760s and outlined how he proposed to start in 
his [88, 89], where he looked at what later generations would call linear 
and quasi-linear first-order partial differential equations in two 
variables. For reasons of brevity, we shall pass to his later, more general 
account where he repeated most of this analysis and also discussed 
second-order linear partial differential equations. This is Institutionum 
Calculi Integralis Volume 3 of 1770. 

He dealt with first-order equations of various kinds in Sect. 1 of Part 
1 of the book by working through a long series of examples of steadily 
increasing complexity. He would present what we might call a theorem 
as a problem, show how to tackle it in general, comment on the 
solution, perhaps by obtaining it another way, and then work through 
some examples. As so often with Euler, it is not clear whether he was 
doing this solely on pedagogical grounds or if he was writing up a 
version of his own original route through the topic.²

In Chap. 2, Euler began with the simplest case of a first-order partial
differential equation:


\[ \frac{\partial z}{\partial x} = a, \]


where a is a constant, and the solution is z = ax + f(y). He carefully
pointed out that the arbitrary constant of integration that arises is an
arbitrary function of y. Not only is it not necessarily given by an 
equation but also its graph may be a curve that is drawn by a free 
motion of the hand, or indeed by several such curves not connected in 
any way. This insistence on the extreme arbitrariness of the curve was 
original with Euler and not shared by all his contemporaries, and as we 
shall see, he specifically noted that the initial position of a vibrating 
string may be given by such a function. 

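In this chapter, Euler solved first-order partial differential equations
of various forms. He wrote


\[ p = \frac{\partial z}{\partial x} \quad\text{and}\quad q = \frac{\partial z}{\partial y}, \]


and set himself as Problem 21 the equation

\[ px + qy = 0. \]


He wrote dz = pdx + qdy, eliminated q, and deduced that


\[ dz = p\bigl(dx - (x/y)\,dy\bigr) = py\,d\!\left(\frac{x}{y}\right). \]


Both sides are therefore differentials of a function, and so he deduced
that py must be a function of x/y, say


\[ py = f'\!\left(\frac{x}{y}\right), \]


where f′ is the derivative of an arbitrary function. Therefore,


\[ dz = f'\!\left(\frac{x}{y}\right) d\!\left(\frac{x}{y}\right), \]


and by integrating both sides, he found that


\[ z = f\!\left(\frac{x}{y}\right). \]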

(It would seem that Euler’s idea of arbitrariness has shrunk to implying 
differentiability.) 
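Euler's conclusion in Problem 21 is easy to test numerically. The following Python sketch, under the assumption that f is differentiable, estimates p and q by central differences and evaluates px + qy for z = f(x/y); the helper names `z` and `residual`, the step size h, and the three sample functions are illustrative choices of ours.

```python
import math

def z(x, y, f=math.sin):
    """Candidate solution z = f(x/y) for an arbitrary differentiable f."""
    return f(x / y)

def residual(x, y, f=math.sin, h=1e-6):
    """Central-difference estimate of p*x + q*y, with p = dz/dx and q = dz/dy."""
    p = (z(x + h, y, f) - z(x - h, y, f)) / (2 * h)
    q = (z(x, y + h, f) - z(x, y - h, f)) / (2 * h)
    return p * x + q * y

# The residual should be zero up to discretisation and rounding error,
# whichever differentiable f is chosen.
for f in (math.sin, math.exp, lambda t: t ** 3):
    print(residual(1.3, 0.7, f))
```

The check succeeds for every smooth f tried, but it says nothing about hand-drawn or disconnected curves, which is exactly where Euler's notion of an arbitrary function goes beyond what differentiation can verify.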

In Problem 22 of this chapter, Euler got close to the general first-order
linear partial differential equation (to use a more modern term). He
wrote


a function z(x, y) is sought for which, on writing $p = \frac{\partial z}{\partial x}$ and
$q = \frac{\partial z}{\partial y}$, one has q = pV, where V = V(x, y).


As before, he worked with differentials—these are differential 
equations, after all. He observed that the given equation and the 
identity dz = pdx + qdy imply that 

dz = p(dx + Vdy). 


He then remarked 


Now a multiplier M will be given, likewise a function of x and y,
so that M(dx + Vdy) is made integrable. Therefore there is put
M(dx + Vdy) = dS, and also S will be given a function of the
same x and y. Hence since there shall be $dz = \frac{p\,dS}{M}$, it is evident
that the quantity $\frac{p}{M}$ must be equal to a function of S, whereby if
we put $\frac{p}{M} = f'(S)$ there becomes z = f(S) and thereupon will
be

p = Mf′(S) and q = MVf′(S).


We shall turn to the multiplier shortly, but first, let us unpack his
solution. We can interpret the solution this way: if there is a function
M = M(x, y) such that M(dx + Vdy) is integrable, and say the integral
is S(x, y) so

M(dx + Vdy) = dS,


then the solution to the partial differential equation is an arbitrary
function of S. This is because


\[ dz = p(dx + V\,dy) = \frac{p}{M}\,dS = f'(S)\,dS, \]


so, integrating f′(S),


\[ z = \int \frac{p}{M}\,dS + G(x), \]


where G(x) is an arbitrary function of x.
Euler’s comments are interesting because they are circumspect. He 
went on 


Corollary 5.1 Therefore in this case the function sought z is found at 
once expressed in terms of x and y, because S is given by x and y. But it 
can come about the S gives rise to a transcending function, so that 
moreover by the methods so far known the multiplier M indeed cannot 
be found. 


Euler’s remark that “the multiplier M indeed cannot be found” may well 
mean that the multiplier might simply not exist. In his ([88], Sect. 20), 
he had said that this was too much to hope for, so his later opinion 
depends on what he made of d’Alembert’s discussion, published in 
d’Alembert [56], which we shall look at shortly. Perhaps he meant, in
agreement with d'Alembert, that the multiplier cannot be found 
explicitly. 

With Problem 23 Euler reached what we would call the general
first-order linear partial differential equation in two variables. He
wrote the partial differential equation he was interested in in the form


\[ \frac{\partial z}{\partial y} = V\frac{\partial z}{\partial x} + U, \tag{5.1} \]

where U and V are functions of x and y (see Sect. 1, Chap. 5, Problem 23,
Sect. 146). He then solved this equation by again recalling that
dz = pdx + qdy, passing to the differential equation


dz = p(dx + Vdy) + Udy,

and then looking for an integrating factor M = M(x, y) such that
M(dx + Vdy) is exact, say


M(dx + Vdy) = dS.


He argued that when M is found then


\[ dz = \frac{p\,dS}{M} + U\,dy, \]

and in this equation z, and therefore U and M, can be treated as
functions of y and S. This equation may be integrated if S is held
constant, to yield


\[ z = \int U\,dy = T + f(S), \]


where T is a function of y and S. So the solution is given as a function of
y and S. But still, Euler did not discuss how M, the integrating factor (or
multiplier, as he called it), can be found, except to note that it is sufficient
that a solution of the equation dx + Vdy = 0 can be found, and he was
unable to give a general account of this aspect of the problem. As he put
it:


COROLLARY 2 Sect. 148. To this end it is convenient to consider
the differential equation dx + Vdy = 0; for if this can be
integrated, likewise thereupon it will be possible to deduce the
multiplier M, so that the formula M(dx + Vdy) truly becomes
the differential of a certain function S, which therefore hence
may be found.


Here as always Euler was not interested in fitting the general solution 
to a set of initial or boundary conditions. Just as with ordinary 
differential equations, the full solution was a formula with a degree of 
arbitrariness to it. 

Two questions arise: What did Euler mean by saying that a
multiplier might not always exist but that it was sufficient that a
solution of the ordinary differential equation dx + Vdy = 0 can be
found, and why did he not solve the ordinary differential equation, or at
least say that it can be solved? He had, after all, given his solution
method for the first-order ordinary differential equation in Volume 1 of
the Institutionum Calculi Integralis.

To deal with the multiplier issue first, in Problem 22 the partial
differential equation is


\[ v_y - V v_x = 0. \]

This is an equation for an unknown function v(x, y). What can we
say about the curves given by $v(x, y) = v_0$, where $v_0$ is a constant (the
level curves of the function v)? Along these curves, dv = 0. But we
always have $dv = v_x\,dx + v_y\,dy$, so, comparing this equation with the
given partial differential equation, we find that


\[ \frac{dy}{dx} = -\frac{v_x}{v_y} = -\frac{1}{V}, \]

so we have the ordinary differential equation


dx + Vdy = 0.


This has as its solutions the curves $v(x, y) = v_0$. If we write


\[ dv = v_x\,dx + v_y\,dy \]

then we see that

\[ dv = v_x\left(dx + \frac{v_y}{v_x}\,dy\right) = v_x(dx + V\,dy), \]

so the integrating factor is $v_x$, where $v(x, y) = v_0$ is the solution of the
ordinary differential equation


dx + Vdy = 0.


Euler could surely have written this down, and it is not clear why he did
not. A partial answer is that he had been committed to the formalism of
(inexact) differentials and multipliers for as many as thirty years by
now, and saw no reason to change. Another is that he was hung up on
the fact that the multiplier is not explicit.

What is all the more curious about his remark about the ordinary
differential equation is that in Volume I of the book, published in 1768,
Euler had described a method for an approximative solution to any
differential equation of the form

\[ \frac{dy}{dx} = V(x, y). \]

In Sect. 650, Problem 85, he described how, from an initial point
(x, y) = (a, b) one can proceed in small steps of ω in x to $a_1, a_2, \ldots$,
setting y successively equal to $b_1 = b + V(a, b)\omega$, then
$b_2 = b_1 + V(a_1, b_1)\omega$, etc. The sequence of points $(a_i, b_i)$ lies on a curve
that approximates the solution to the differential equation through the
initial point (a, b).
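The stepping procedure of Problem 85 translates directly into code. What follows is a minimal Python sketch of it, not a transcription of Euler's own worked examples; the function name `euler_polygon` and the test equation dy/dx = y with y(0) = 1 (exact solution exp x) are assumptions chosen only so that the error can be measured.

```python
import math

def euler_polygon(V, a, b, omega, n_steps):
    """Return the points (a_i, b_i) obtained by stepping dy/dx = V(x, y) from (a, b)."""
    points = [(a, b)]
    for _ in range(n_steps):
        b = b + V(a, b) * omega   # b_{i+1} = b_i + V(a_i, b_i) * omega
        a = a + omega             # a_{i+1} = a_i + omega
        points.append((a, b))
    return points

# Illustrative test equation: dy/dx = y, y(0) = 1, integrated up to x = 1.
errors = []
for n in (10, 100, 1000):
    x_end, y_end = euler_polygon(lambda x, y: y, 0.0, 1.0, 1.0 / n, n)[-1]
    errors.append(abs(y_end - math.e))
    print(n, y_end, errors[-1])
```

In this example the error at x = 1 shrinks as the step ω is reduced, in line with Euler's informal remark that smaller steps give a more accurate approximation.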

He indicated in informal terms that the smaller the steps the more 
accurate the approximation would be, but he recognised that the 
further the process was continued the more the errors would add up 
and that this process would be particularly liable to gross error if V was 
very large or very small. To investigate the way the error can behave, 
Euler offered an argument (Sects. 656-667) that brought in more and 
more terms for the power series for y. The errors grow fastest when 
some of these terms are large, which can happen when V(x, y) becomes 
either zero or infinite, and Euler finished his account by providing 
examples to indicate how to work around this problem. 

It is far from certain that Euler thought that his remarks about 
finding an approximative solution constitute a proof that ordinary 
differential equations have a solution. He most likely thought that was 
simply true and never in need of a proof. What he offered was what he 
said it was: a method of finding an approximate solution that would 
work if handled with care. 

It is much more difficult to understand why he did not regard the
ordinary differential equation as solved and therefore the multiplier
found. He might have thought that the solution method he had
proposed was an infinite sequence of approximations, so it did not
provide useful answers. But it is hard to believe that he thought it
subject to any significant restrictions.


5.2.1 Second-Order Partial Differential Equations 

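Euler next turned to the subject of second-order partial differential
equations. He began his study of the second-order linear partial
differential equation with an explanation of what the first and second
partial derivatives are and how they behave under changes of variable
(see Part 1, Sect. 2, Chap. 1, Sect. 229, Problem 39). He obtained these
equations, which express the partial derivatives of a function z with
respect to new variables u and v that are related to the old variables x
and y by expressions of the form u = u(x, y), etc.³


\[ \frac{\partial z}{\partial x} = \frac{\partial v}{\partial x}\frac{\partial z}{\partial v} + \frac{\partial u}{\partial x}\frac{\partial z}{\partial u}, \quad \frac{\partial z}{\partial y} = \frac{\partial v}{\partial y}\frac{\partial z}{\partial v} + \frac{\partial u}{\partial y}\frac{\partial z}{\partial u}, \]

\[ \frac{\partial^2 z}{\partial x^2} = \frac{\partial^2 v}{\partial x^2}\frac{\partial z}{\partial v} + \frac{\partial^2 u}{\partial x^2}\frac{\partial z}{\partial u} + \left(\frac{\partial v}{\partial x}\right)^2\frac{\partial^2 z}{\partial v^2} + 2\frac{\partial v}{\partial x}\frac{\partial u}{\partial x}\frac{\partial^2 z}{\partial u\,\partial v} + \left(\frac{\partial u}{\partial x}\right)^2\frac{\partial^2 z}{\partial u^2}, \]

\[ \frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial^2 v}{\partial x\,\partial y}\frac{\partial z}{\partial v} + \frac{\partial^2 u}{\partial x\,\partial y}\frac{\partial z}{\partial u} + \frac{\partial v}{\partial x}\frac{\partial v}{\partial y}\frac{\partial^2 z}{\partial v^2} + \left(\frac{\partial v}{\partial x}\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}\frac{\partial u}{\partial x}\right)\frac{\partial^2 z}{\partial u\,\partial v} + \frac{\partial u}{\partial x}\frac{\partial u}{\partial y}\frac{\partial^2 z}{\partial u^2}, \]

\[ \frac{\partial^2 z}{\partial y^2} = \frac{\partial^2 v}{\partial y^2}\frac{\partial z}{\partial v} + \frac{\partial^2 u}{\partial y^2}\frac{\partial z}{\partial u} + \left(\frac{\partial v}{\partial y}\right)^2\frac{\partial^2 z}{\partial v^2} + 2\frac{\partial v}{\partial y}\frac{\partial u}{\partial y}\frac{\partial^2 z}{\partial u\,\partial v} + \left(\frac{\partial u}{\partial y}\right)^2\frac{\partial^2 z}{\partial u^2}. \]

In Chap. 2, he then explained how to deal with equations of the form
$z_{xx} = P(x, y)$ and showed by integrating twice that the general solution
is


\[ z = \int\left(\int P\,dx\right)dx + xF(y) + G(y), \]


where F and G are arbitrary functions. He then discussed what
equations can be reduced to one of this form by suitable changes of
variable, and explored how to extend the method of changes of variable
to equations of the form $z_{xx} = P(x, y, z)z_x + Q(x, y, z)$.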

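Section 2, Chap. 3 begins in Sect. 296 with the problem of solving
the wave equation


\[ \frac{\partial^2 z}{\partial y^2} = a^2\frac{\partial^2 z}{\partial x^2}, \]


where a is a constant. Euler reduced it in the way just described, by
showing that the substitutions


\[ t = \alpha x + \beta y, \quad u = \gamma x + \delta y \]

transform the equation to


\[ (\beta^2 - a^2\alpha^2)\frac{\partial^2 z}{\partial t^2} + 2(\beta\delta - a^2\alpha\gamma)\frac{\partial^2 z}{\partial t\,\partial u} + (\delta^2 - a^2\gamma^2)\frac{\partial^2 z}{\partial u^2} = 0. \]

So Euler set

\[ \alpha = 1, \quad \beta = a, \quad \gamma = 1, \quad\text{and}\quad \delta = -a, \]

thus reducing the equation to

\[ \frac{\partial^2 z}{\partial t\,\partial u} = 0, \]


which he had earlier shown is solved by integrating twice and has the
complete solution z = f(t) + F(u), where f and F are arbitrary
functions.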
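With t = x + ay and u = x − ay, the complete solution can be checked numerically. The Python sketch below, under the assumption that f and F are smooth, approximates the second derivatives by central second differences and evaluates the difference between ∂²z/∂y² and a² ∂²z/∂x²; the value a = 2 and the choices f = sin and F(s) = exp(−s²) are hypothetical, picked only for illustration.

```python
import math

A = 2.0  # the constant a in z_yy = a^2 z_xx (value chosen for illustration)

def z(x, y):
    """z = f(x + a y) + F(x - a y) with f = sin and F(s) = exp(-s^2)."""
    return math.sin(x + A * y) + math.exp(-(x - A * y) ** 2)

def second_difference(g, s, h=1e-4):
    """Central second difference approximating g''(s)."""
    return (g(s + h) - 2 * g(s) + g(s - h)) / (h * h)

def wave_residual(x, y):
    """Numerical estimate of z_yy - a^2 z_xx at the point (x, y)."""
    z_yy = second_difference(lambda t: z(x, t), y)
    z_xx = second_difference(lambda t: z(t, y), x)
    return z_yy - A * A * z_xx

# The residual should vanish up to discretisation and rounding error.
print(wave_residual(0.3, 0.2))
```

A finite-difference check of this kind only makes sense where f and F are twice differentiable, so it cannot settle the question of how far the functions may be arbitrary.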
Euler then wrote 


COROLLARY 1 Sect. 297. Therefore the value z of this is equal to
the sum of two arbitrary functions, the one is of x + ay, and
other of x − ay, and both these functions thus can be assumed
at will, so that also discontinuous functions are able to be taken
in place of these.


COROLLARY 2 Sect. 298. Therefore any two curves described
freely by the hand as it pleases are able to be taken according to
this usage. Evidently if in one the abscissa is taken as = x + ay,
and in the other truly the abscissa = x − ay, then the sum of the
applied lines [i.e. the y-coordinates - Editor’s note] will always
put in place a suitable value for the function z.


The first corollary is Euler’s way of saying that the solution is a
sum of an arbitrary function of x + ay and an arbitrary function of
x − ay. The second one says that if the coordinates are changed to
u = x + ay and v = x − ay then the solution is the sum f(u) + F(v).


Euler then compared his solution method with d’Alembert’s, noting
that he admitted a much greater range of candidates for the functions f
and F than d’Alembert had done.⁵ Of more interest in the present
context is Euler’s Scholium 3 (Sect. 301), where he remarked that our
solution has this disadvantage because it leads to an imaginary
expression for this equation


\[ \frac{\partial^2 z}{\partial y^2} + a^2\frac{\partial^2 z}{\partial x^2} = 0, \tag{*} \]

evidently

\[ z = f(x + ay\sqrt{-1}) + F(x - ay\sqrt{-1}). \]

He then noted that


\[ \tfrac{1}{2}f(x + ay\sqrt{-1}) + \tfrac{1}{2}f(x - ay\sqrt{-1}) \quad\text{and}\quad \frac{1}{2\sqrt{-1}}F(x + ay\sqrt{-1}) - \frac{1}{2\sqrt{-1}}F(x - ay\sqrt{-1}) \]


will always be real. However, although Euler was confident that this
reduction to real values would always be possible for curves that he
called analytic (i.e. were given as explicit functions of a real variable) or
could be represented as series of sines and cosines, he explicitly
doubted that this would be possible for arbitrary curves, drawn as he
put it by the free motion of the hand. He therefore concluded that this
was “a great defect in the calculation, on account of which a great many
solutions lose their power”.

This means that although Euler had given a recipe for producing a
solution to the equation (*) by starting with a real expression such as
$x^2$ or sin x (and obtaining $x^2 - a^2y^2$ or sin x cosh ay) he could not be
sure that all solutions were of this kind.⁶ This reflects a deeper lack of
knowledge about the passage from complex functions to harmonic
functions that only became available after Riemann.
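Euler's recipe can be tried out with complex arithmetic. The following Python sketch forms the half-sum of f at x + ay√−1 and at x − ay√−1 for the two expressions just mentioned and compares the result with the stated real forms; the function name `real_solution`, the value a = 1.5 and the sample point are our own illustrative assumptions.

```python
import cmath
import math

A = 1.5  # the constant a in z_yy + a^2 z_xx = 0 (value chosen for illustration)

def real_solution(f, x, y):
    """Half-sum (1/2) f(x + a y i) + (1/2) f(x - a y i), as (real, imaginary) parts."""
    w = complex(x, A * y)
    value = 0.5 * f(w) + 0.5 * f(w.conjugate())
    return value.real, value.imag

x, y = 0.4, 0.3
# Starting from the square: expect x^2 - a^2 y^2 with zero imaginary part.
print(real_solution(lambda w: w * w, x, y), x * x - A * A * y * y)
# Starting from the sine: expect sin x cosh(a y) with zero imaginary part.
print(real_solution(cmath.sin, x, y), math.sin(x) * math.cosh(A * y))
```

Both starting expressions are given by formulas that make sense for complex arguments, which is precisely the restriction Euler was worried about: the recipe has no obvious meaning for a curve drawn freely by hand.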
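Moreover, the equation (*) is unique among equations of the form

\[ A\frac{\partial^2 z}{\partial y^2} + 2B\frac{\partial^2 z}{\partial x\,\partial y} + C\frac{\partial^2 z}{\partial x^2} + R\frac{\partial z}{\partial y} + S\frac{\partial z}{\partial x} + Tz + V = 0, \]

where A, B, and C are functions of x and y and $B^2 - AC < 0$, in that it
has a “twin” with $B^2 - AC > 0$, and so Euler had no way in to analyse
any of the others.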

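He therefore confined his attention to partial differential equations
that do not raise this problem, and Chap. 3 steadily establishes that all
second-order linear partial differential equations of the form

\[ \frac{\partial^2 z}{\partial y^2} - 2P\frac{\partial^2 z}{\partial x\,\partial y} + (P^2 - Q^2)\frac{\partial^2 z}{\partial x^2} + R\frac{\partial z}{\partial y} + S\frac{\partial z}{\partial x} + Tz + V = 0, \]


where the coefficients are functions of x and y, can be reduced to ones
of the form

\[ \frac{\partial^2 z}{\partial u\,\partial v} + P_1\frac{\partial z}{\partial u} + Q_1\frac{\partial z}{\partial v} + R_1 z + S_1 = 0. \]

All the necessary formulae for the changes of variable had been set out
in Chap. 1.

His conclusion was that the new variables u and v must satisfy the
partial differential equations


\[ \frac{\partial v}{\partial y} = (P + Q)\frac{\partial v}{\partial x} \quad\text{and}\quad \frac{\partial u}{\partial y} = (P - Q)\frac{\partial u}{\partial x}. \]

These are of the form that Euler had already shown how to solve.⁷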

I add a few remarks about the generality of the solution to the wave 
equation.⁸ D’Alembert always maintained that for a function to be a
solution of the equation it had first to be a candidate, and for that 
reason analytic; he tried to reject the idea that there might be classes of 
functions to which the calculus did not automatically apply. Challenged 
by Euler, he gave a geometric argument that rested on the idea that the 
curvature of the string must vary continuously, which a modern reader 
could interpret as saying a solution must be twice continuously 
differentiable as a function of each variable. But Euler was willing to 
contemplate more general solutions, and so he gave arguments to show 
that at points where the solution curve is not differentiable there are 
other solutions that are infinitely close, and so any errors that are 
introduced at such points will be negligible. 

What about the intermediate case


\[ \frac{\partial^2 z}{\partial x^2} = \frac{\partial z}{\partial y}? \]

Because only one second-order term appears this is of the form that
Euler had treated in Chap. 2. But for this specific equation, his only
comments (Sect. 265) were that his methods do not apply and it is
“allowed to be understood that the resolution of this has to be thought
out with the greatest hardship”.


5.3 The Introduction of Characteristics by d’Alembert


D’Alembert was already interested in these questions, and after he
met Euler in Berlin in 1762 he had a fruitful insight into the question
of the existence of integrating factors, which forms a response to Euler’s
papers of 1763 and 1764. This is his theory of characteristics.⁹


In the fourth volume of his Opuscules, d’Alembert argued that given
the equation dx + ady = 0, where a = a(x, y), it is possible at each
point on the y-axis to draw through it a curve along which
dx + ady = 0.¹⁰ In this way, one obtains a family of curves, one through
each point of the y-axis, along which dx + ady = 0. Then, if M is the
factor that makes the differential dx + ady exact, the integral of
this exact differential will be constant along these curves and only vary
from curve to curve. So, it is enough to prescribe the values of M
arbitrarily along the y-axis for the function M to be known everywhere
in the plane. This is the basic method of characteristics: the curves
d’Alembert defined are known as the characteristic curves and the
method is valid at least locally and for as long as the curves are not
tangent to the y-axis.

D’Alembert described the values of M at each point as a height
above the (x, y)-plane, so the curves of constant M are curves of
constant height, which we might call level curves or contour lines.

He also explained that if M is an integrating factor for a differential
dx + ady, so M(dx + ady) = du, say, then so is M times any function of
u, because f(u)M(dx + ady) = f(u)du.


He did not connect this insight to the method of changing variables
when solving a first-order partial differential equation. Had he done so,
he could have said that replacing x by the new variable u reduces the
original partial differential equation to one of the form $z_u = 0$, whose
solutions are that z is an arbitrary function of the other variable y, so the
solution is known everywhere once it is known on the y-axis. The
corresponding differential is dy which is exact, as is any differential
f(y)dy. This suggests that even at this stage the notion of boundary
values for a partial differential equation was not clear or established.

D’Alembert was clear that the sought-for function M would be far
from unique, and he commented that “it can often happen that M will
not be expressible algebraically although it can always be determined
geometrically”. This was his reply to Euler who, in his ([88], Sect. 20)
had said that this was too much to hope for. But d’Alembert was
unhappy that his method was far from being explicit, and so he set out a
tentative method for finding an expression for M “when this can be
done”. This method was little more than the hope that there will be a
change of variables, obtained by regarding x and y as functions of u and
z, that will make the transformed equation tractable because it is now
written in variables that separate.

What became known as the characteristic curves are the solutions
of the differential equation dx + ady = 0, and the value of z(x, y) along
each such curve is determined by the value at an arbitrary given point
on it. So, to solve the partial differential equation one finds a curve A
that is not a characteristic curve but crosses all the characteristic
curves, and assigns a value to the function z at each point on A, and
then solves the ordinary differential equation (5.7) with those initial
values. Under the heading of the method of characteristics, this became
a standard technique in the theory of linear and quasi-linear partial
differential equations.


5.4 Laplace 


In his memoir (1777), Pierre Simon Laplace, a protégé of d’Alembert’s,
presented what he claimed was the first systematic treatment of linear
partial differential equations, one that went beyond the isolated cases
treated by d’Alembert and others.¹¹ He praised d’Alembert as the
inventor of the calculus of partial differential equations, and mentioned
neither Euler nor Lagrange by name, perhaps because d’Alembert had
been helpful to him early in the younger man’s career, by securing him a
professorship in mathematics at the École Militaire in 1769, the year
he turned 20, and d’Alembert and Euler had been rivals until the 1760s.

Laplace began with the first-order equation, which we shall write in
a form equivalent to his as

$$\alpha \frac{\partial z}{\partial x} + \beta \frac{\partial z}{\partial y} = V, \tag{5.2}$$

where $\alpha$ and $\beta$ are functions of x and y and V is a function of x and y if
the equation is linear. (Laplace allowed V to be a function of x and y
with a linear term in z, a slight generality that we suppress here.) This
is the same equation as Euler’s equation (5.1), and his solution method
was the same as Euler’s, except for a little more clarity about the
integrating factor that arises. Like Euler, he said nothing about the
solutions being constant along any characteristic curves.

Then, in his ([174], 21–41), Laplace turned to the second-order
linear partial differential equation, which he wrote in the form

$$z_{xx} + \alpha z_{xy} + \beta z_{yy} + \gamma z_x + \delta z_y + \lambda z + T = 0, \tag{5.3}$$

where $\alpha$, $\beta$, $\gamma$, $\delta$, $\lambda$, and T are functions of x and y. He looked for a
change of variables u = u(x, y), v = v(x, y) that would put the equation 
in the form 


$$z_{uv} + (\text{terms in } z_u,\ z_v,\ z, \text{ and a known function of } u \text{ and } v) = 0.$$


From the values of the various partial derivatives of z with respect to
the new variables, he deduced that the coefficients of $z_{uu}$ and $z_{vv}$ are,
respectively,

$$u_x^2 + \alpha u_x u_y + \beta u_y^2 \quad\text{and}\quad v_x^2 + \alpha v_x v_y + \beta v_y^2,$$

so, for the transformed equation to reduce to the required form, the
new variables must satisfy the differential equations

$$u_x^2 + \alpha u_x u_y + \beta u_y^2 = 0, \qquad v_x^2 + \alpha v_x v_y + \beta v_y^2 = 0. \tag{5.4}$$

He factorised these to get these equations¹²:

$$u_x = u_y\left(-\alpha/2 + \sqrt{(\alpha/2)^2 - \beta}\right), \qquad v_x = v_y\left(-\alpha/2 - \sqrt{(\alpha/2)^2 - \beta}\right), \tag{5.5}$$


which are of the type he had shown how to solve earlier in the memoir,
as we described above. Indeed, to solve them, use the fact that
along a curve on which u(x, y) is constant we have $u_x dx + u_y dy = 0$,
so in Eqs. (5.4) we may replace $u_x/u_y$ by $-dy/dx$. This leads to the equation

$$(dy)^2 - \alpha\,dx\,dy + \beta\,(dx)^2 = 0,$$

which we write as

$$\left(\frac{dy}{dx}\right)^2 - \alpha\frac{dy}{dx} + \beta = 0.$$

This is a quadratic equation whose solutions are

$$\frac{dy}{dx} = \alpha/2 + \sqrt{(\alpha/2)^2 - \beta} \quad\text{and}\quad \frac{dy}{dx} = \alpha/2 - \sqrt{(\alpha/2)^2 - \beta}.$$


The characteristic curves are the solutions of these equations.
Along these curves Eqs. (5.4) hold, so the partial differential
equation reduces to

$$z_{uv} = 0,$$

which has as its solutions anything of the form z = f(u) + g(v). So once a
non-characteristic curve is found that crosses these curves, the values of the
solution are found, just as in the simple case of the wave equation, by
saying that f is constant along one set of curves and g is constant
along the other.
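
As a rough numerical illustration of this reduction (it is not anything Laplace wrote), the following Python sketch assumes constant coefficients α and β, no lower-order terms, and the real case. It takes the slopes m of the characteristics to be the roots of m² − αm + β = 0, sets u = y − m₁x and v = y − m₂x, and checks by finite differences that z = f(u) + g(v) satisfies z_xx + αz_xy + βz_yy = 0. The function names and the choices f = sin, g = exp are hypothetical.

```python
import math

def characteristic_slopes(alpha, beta):
    """Roots m of m**2 - alpha*m + beta = 0 (real case only)."""
    disc = (alpha / 2) ** 2 - beta
    if disc < 0:
        raise ValueError("complex characteristics: the real reduction fails")
    r = math.sqrt(disc)
    return alpha / 2 + r, alpha / 2 - r

def residual(alpha, beta, x, y, h=1e-3):
    """Finite-difference value of z_xx + alpha*z_xy + beta*z_yy at (x, y)."""
    m1, m2 = characteristic_slopes(alpha, beta)
    f, g = math.sin, math.exp  # arbitrary smooth choices (assumptions)

    def z(x, y):
        # u = y - m1*x and v = y - m2*x are constant along the two
        # families of characteristic lines dy/dx = m1 and dy/dx = m2.
        return f(y - m1 * x) + g(y - m2 * x)

    z_xx = (z(x + h, y) - 2 * z(x, y) + z(x - h, y)) / h**2
    z_yy = (z(x, y + h) - 2 * z(x, y) + z(x, y - h)) / h**2
    z_xy = (z(x + h, y + h) - z(x + h, y - h)
            - z(x - h, y + h) + z(x - h, y - h)) / (4 * h**2)
    return z_xx + alpha * z_xy + beta * z_yy
```

With α = 1 and β = −2 the slopes are 2 and −1, and the residual is zero up to discretisation error; with (α/2)² < β the function raises an error, which is the complex case that Laplace did not distinguish.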

Laplace then attempted to solve the original Eq. (5.3) in its reduced 
form so as to obtain its general solution, and to supply extra 
information to provide particular solutions. He set no store by what we, 
however, would see as an essential difference between the cases where 


$\sqrt{(\alpha/2)^2 - \beta}$ is real and where it is complex. In the real case, the


change of variables produces new real variables and real characteristic 
curves, but in the complex case, the new variables are complex and the 
characteristic curves are complex. In this latter circumstance, the only 
hope for Laplace is that the imaginary parts will somehow vanish at the 
end. But he made no remark to that effect, and it would seem that he 


thought that the reduction is always possible. This is remarkable, given 
that Euler had raised an alarm in the simplest case, and because it 
breaks an analogy (which perhaps Laplace missed) with the reduction 
of a curve given by a quadratic equation in two variables to either an 
ellipse or a hyperbola (or a parabola).¹³ No mathematician would have
thought that linear transformations could confuse an ellipse with a 
hyperbola. 

This is further evidence that the understanding of partial 
differential equations in the mid-1770s was often purely formal, and 
many significant issues remained to be discovered. 

We shall just consider the real case. Now there are new variables u
and v that, respectively, satisfy the two equations in (5.5),
and with respect to these new variables the partial differential
equation takes the form

$$\frac{\partial^2 z}{\partial u\,\partial v} + a\frac{\partial z}{\partial u} + b\frac{\partial z}{\partial v} + cz = 0, \tag{5.6}$$

in which a, b, c are functions of u and v. This equation includes, as a 
particular case, the equation for the transmission of sound, which 
Euler was to consider in 1776 (as we saw above). Laplace put forward 
some ingenious but complicated solution methods that we shall not be 
able to pursue. 


5.4.1 Lagrange’s Method 
In his [174], Lagrange took the equation

$$z_x = V z_y + Z,$$

where V is a function of x and y, and Z is a function of x, y, z, and
deduced using the identity $dz = z_x dx + z_y dy$ that

$$dz = (V dx + dy) z_y + Z dx.$$

He then went on (Lagrange [174], 83):


Suppose for a moment that

$$V dx + dy = 0, \tag{5.7}$$

then I have an equation in two variables, which I integrate,
adding an arbitrary constant $\alpha$. Now I regard $\alpha$ as a function of
x and y determined by this equation; by differentiation I have

$$V dx + dy = A\,d\alpha,$$

A being a function of x, y and $\alpha$. So, substituting this value in the
preceding equation it becomes

$$dz = A z_y\,d\alpha + Z dx.$$

Now, if one replaces y everywhere in this equation by its value in
terms of x and $\alpha$ one thence has an equation in x, z, and $\alpha$, and
supposing $\alpha$ constant one will have the equation $dz = Z dx$
between the two variables x and z, which can be integrated with
the addition of an arbitrary constant, which will be an arbitrary
function of $\alpha$, thus giving the general solution of the proposed
equation at once, because it only remains to replace $\alpha$ by its
value in x and y.


This is close to being the first statement of the modern approach, that 
puts the emphasis on some ordinary differential equations and drops 
any consideration of differentials and multipliers.¹⁴


5.5 Exercises 
1. Solve the partial differential equations

px + qy = 0,

q = pV,

q = pV + U,

and compare your solutions with Euler’s.


Questions 


1. What sorts of solutions did Euler admit to the partial differential
equation $\partial z/\partial x = 0$?

2. What sorts of functions would Euler or d’Alembert have supposed
prescribed values along a transversal to a family of characteristics?


Footnotes 


1 This volume was published in 1768. It and the next volume deal with ordinary differential 
equations, the third volume with partial differential equations. There is a useful English 
translation of all three books by Ian Bruce, available at the Euler Archive. 


2 The translation is taken, with slight modification, from Ian Bruce’s version on the Euler 
Archive. I have changed Euler’s notation to make it more like ours. 


3 From this it follows that 
dT = Udy + Wds, 


i dz. BP ’ 
and so, using the fact that £ =£,4= W+ f'(S)- 


4 Euler wrote t where we have written v here and throughout. 


5 D’Alembert’s arbitrary functions were nonetheless regarded by him as analytic because, in 
his view, the calculus applied to them. Euler was more and more of the opinion that the initial 
conditions for the wave equation could be given by much more general curves. 


6 He did not say this explicitly but left the result to be deduced from his formulae. 


7 Laplace did the same thing, perhaps independently, in his treatment of partial differential 
equations in his [174]. 


8 See Lützen’s [191], 15–23, which also considers Lagrange’s ideas in this regard.


9 For a revision of the mathematics, see Appendix B. 


10 See d’Alembert [56], pp. 225-281, and especially 255-258. 


11 On Laplace, see Gillispie [120]. 


12 This corrects a trivial error in Laplace’s paper: he wrote a for y(t) in the square root. 


13 Strictly speaking, the quadratic must be non-degenerate. 


14 Lagrange also showed how to vary the argument when V and Z may involve the derivatives 
of the unknown function. 
6.1 Introduction


The first systematic theories of first- and second-order partial 
differential equations were developed by Lagrange and Monge in the 
late eighteenth century. Lagrange’s approach was not as 
overwhelmingly algebraic as the bulk of his work might suggest; in 
particular, he seems to have introduced the idea of envelopes of curves 
and surfaces into the study of differential equations. Monge’s work is, 
however, the start of the modern geometric theory of partial differential 
equations, and we shall defer consideration of it to Sect. 8.2 below. 


6.2 Clairaut’s Paradox 


In 1734, Clairaut had raised a paradoxical finding that can be described
in modern terms as follows. Let $p = \frac{dy}{dx}$ and consider the differential
equation

$$y = xp + f(p).$$

Differentiating this gives

$$p = p + xp' + p' f'(p),$$

or

$$p'(x + f'(p)) = 0.$$

This, combined with the original equation, gives a solution in the
parameterised form

$$x = -f'(p), \quad y = xp + f(p).$$


What struck Clairaut and others as remarkable was that the solution to 
a differential equation had been found by further differentiating and not 
by integrating, as was to be expected. 
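
A small numerical check may make the finding concrete. The sketch below assumes the particular choice f(p) = p², which is ours and not Clairaut’s, and all function names are hypothetical. The other factor, p′ = 0, gives the straight lines y = cx + f(c); the code checks that the parameterised solution traces the curve y = −x²/4 and that at each of its points it meets, with the same slope, the line whose constant c equals the parameter p.

```python
def clairaut_line(c, x):
    """Member y = c*x + f(c) of the family of straight-line solutions, f(p) = p**2."""
    return c * x + c ** 2

def singular_point(p):
    """Point of the parameterised solution x = -f'(p), y = x*p + f(p) for f(p) = p**2."""
    x = -2 * p
    return x, x * p + p ** 2

def singular_slope(x):
    """Slope of the curve y = -x**2/4 traced out by singular_point."""
    return -x / 2

for p in (-1.5, 0.5, 2.0):
    x, y = singular_point(p)
    on_curve = abs(y - (-x ** 2 / 4)) < 1e-12      # the point lies on y = -x^2/4
    on_line = abs(clairaut_line(p, x) - y) < 1e-12  # and on the line with c = p
    same_slope = abs(singular_slope(x) - p) < 1e-12 # with the same tangent
    print(p, on_curve, on_line, same_slope)
```

In this instance the solution found by differentiation is tangent to every line of the family, which anticipates the role of envelopes discussed in this chapter.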

When Euler wrote about this some twenty years later (in his E236) 
he opened by saying 


I propose here to study a paradox in the integral calculus that 
can appear very strange: 

We sometimes encounter differential equations for which it 
would seem very difficult to find the integrals by the rules of 
integral calculus, and which however are easy to find, not by 
means of an integration, but rather by differentiating the 
proposed equation again; so a repeated differentiation leads us 
in these cases to the sought-for integral. It is undoubtedly a very 
surprising accident, that differentiation can lead us to the same 
goal, that we are accustomed to find by integration, which is an 
entirely opposite operation. 


Euler went on to connect this paradox to another unexpected result: a
differential equation may be solved by an expression that does not arise
from the general solution by a choice of the arbitrary constant it
contains.¹ He gave this example,

$$x\,dx + y\,dy = dy\,\sqrt{x^2 + y^2 - a^2}.$$

It has the circle with equation $x^2 + y^2 - a^2 = 0$, which implies
$x\,dx + y\,dy = 0$, as a solution, and also the general solution, which is a
one-parameter family of parabolas

$$\sqrt{x^2 + y^2 - a^2} = y + c$$

or

$$x^2 - 2cy - a^2 - c^2 = 0,$$

but the circle cannot be obtained as one of a family of parabolas.




Fig. 6.1 A one-parameter family of parabolas enveloping a circle 


Euler’s explanation of the paradox in this paper did not get close to 
solving it. He returned to the question in 1763 in the first volume of his 
Institutiones Calculi Integralis (Part 1, Sect. 2, Chap. 4, see Sect. 594 of 
the English translation), but he looked for an algebraic criterion to 
resolve it, and that does not get to the heart of the matter. 


The envelope of a family of curves is a curve that touches each 
member of the family. To put that another way, it is a curve that has, at 
the point where it meets a curve of the family, the same tangent as the 
curve of the family does. It is a good exercise to check that the 
parabolas in Fig. 6.1 envelope the circle in this sense. 

Let the family of curves be given by the equations f(x, y, a) = 0,
where a is a parameter that varies from curve to curve. Let
g(x, y) = 0 be the equation of the envelope. Differentiating the first
equation says that, at points on curves in the family, we have

$$f_x(x, y, a)\,dx + f_y(x, y, a)\,dy + f_a(x, y, a)\,da = 0.$$

If we fix a value of the parameter a then da = 0 and so at points on that
specific curve in the family

$$f_x(x, y, a)\,dx + f_y(x, y, a)\,dy = 0, \quad\text{so}\quad \frac{dy}{dx} = -\frac{f_x(x, y, a)}{f_y(x, y, a)}.$$

Differentiating the second equation says that, at points on the curve
g(x, y) = 0, we have

$$g_x(x, y)\,dx + g_y(x, y)\,dy = 0, \quad\text{and}\quad \frac{dy}{dx} = -\frac{g_x(x, y)}{g_y(x, y)}.$$

Along the envelope the parameter of the curve being touched varies, so
there the full relation with da not equal to zero applies. The slopes of
the tangent at the curve with parameter a and the curve
g(x, y) = 0 will therefore be the same at a point (x, y) where they meet
if at that point $f_a(x, y, a) = 0$. So the envelope is the set of points where
both f(x, y, a) = 0 and $f_a(x, y, a) = 0$, so it is found by eliminating a
between those equations.
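
The elimination can be checked numerically for Euler’s family of parabolas. The sketch below is only an illustration under stated assumptions: the family is taken to be F(x, y, c) = x² − 2cy − a² − c² = 0, the parameter is called c to keep it apart from the constant a of that example, a is set to 1, and the function names are hypothetical. The condition ∂F/∂c = 0 gives c = −y, and the code checks that each point of the circle x² + y² = a² lies on the corresponding parabola and shares its tangent there.

```python
import math

A = 1.0  # the constant a of Euler's example (radius of the circle); value assumed

def F(x, y, c):
    """Family of parabolas x^2 - 2cy - a^2 - c^2 = 0 with parameter c."""
    return x**2 - 2*c*y - A**2 - c**2

def dF_dc(x, y, c):
    """Partial derivative of F with respect to the parameter c."""
    return -2*y - 2*c

def contact(theta):
    """Point of the circle at angle theta and the parameter of the parabola touching it."""
    x, y = A * math.cos(theta), A * math.sin(theta)
    return x, y, -y  # dF_dc = 0 forces c = -y

def parabola_slope(x, c):
    return x / c      # from 2x dx - 2c dy = 0 along a fixed parabola

def circle_slope(x, y):
    return -x / y     # from x dx + y dy = 0 along the circle

for theta in (0.4, 1.1, 2.5):
    x, y, c = contact(theta)
    print(abs(F(x, y, c)) < 1e-12, dF_dc(x, y, c) == 0,
          abs(parabola_slope(x, c) - circle_slope(x, y)) < 1e-12)
```

The angles are chosen away from y = 0, where the slopes are infinite and the comparison would need to be made with dx/dy instead.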
One way of looking at envelopes is to imagine the curves
f(x, y, a) = 0 forming a surface in (x, y, a) space. The equation

$$f_x(x, y, a)\,dx + f_y(x, y, a)\,dy + f_a(x, y, a)\,da = 0$$

can be written as

$$(f_x(x, y, a), f_y(x, y, a), f_a(x, y, a)) \cdot (dx, dy, da) = 0,$$

which says that $(f_x(x, y, a), f_y(x, y, a), f_a(x, y, a))$ is a normal to this
surface. The further condition that $f_a(x, y, a) = 0$ says that the normals


of interest are those that lie in the (x, y)-plane, and this picks out the 
part of the surface that you see if you look down the a-axis from a long 
way away. 


6.3 Lagrange 


A decade later, Lagrange (see Fig. 6.2) made a systematic study of first- 
order partial differential equations in his three papers, [173-175]. We 
shall look at his method for solving a general first-order partial 
differential equation in his (1772); his account of complete, general, 
and particular solutions from his [173] where he solved Clairaut’s 
problem; briefly at his discussion of problems involving more than two 
independent variables in his [175]; and finally at his lectures on the 
subject at the École Polytechnique in 1806.


Fig. 6.2 Joseph Louis Lagrange (1736–1813), artist unknown


6.3.1 Lagrange [173] 


In his [173], Lagrange considered the general first-order non-linear 
partial differential equation in two variables x and y for an unknown 
function u(x, y). This was an ambitious undertaking, given what little 
had previously been discovered about partial differential equations, 
and he modelled his approach, naturally enough, on what was known 
about linear equations and the method of integrating factors.²

A partial differential equation is defined by a function F of the five
variables x, y, u, p, q that satisfies the equation F(x, y, u, p, q) = 0,
where, as he wrote, $p = \frac{\partial u}{\partial x}$ and $q = \frac{\partial u}{\partial y}$, and so, when u is
known as a function of x and y, du = pdx + qdy. The crucial difficulty
with a non-linear equation is that p and q no longer occur to the first
power but may be squared, multiplied by u, or in other novel
combinations.

Lagrange supposed that the partial differential equation can be
written in the form q = q(x, y, u, p). The problem, as he now saw it,
was to find p in terms of u, x, and y so that the expression
du = pdx + qdy is integrable. Before we see why this is true, note that


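this had been observed by Euler in his ([96], Sect. 128), who remarked
that the necessary condition is that, on setting

$$L = \frac{\partial q}{\partial u}, \quad M = -\frac{\partial p}{\partial u}, \quad N = \frac{\partial q}{\partial x} - \frac{\partial p}{\partial y},$$

one has

$$Lp + Mq + N = 0, \quad\text{or}\quad p\frac{\partial q}{\partial u} - q\frac{\partial p}{\partial u} + \frac{\partial q}{\partial x} - \frac{\partial p}{\partial y} = 0.$$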

However, Euler made no remarks on how this equation could be 
satisfied, and turned to other aspects of the theory. 

Lagrange’s (and Euler’s) observation is valid because the existence
of an integrating factor M such that M(du − pdx − qdy) is exact implies
that

$$p\frac{\partial q}{\partial u} - q\frac{\partial p}{\partial u} + \frac{\partial q}{\partial x} - \frac{\partial p}{\partial y} = 0, \tag{6.1}$$

and conversely, a solution p of Eq. (6.1) implies that an integrating
factor L for pdx + qdy can be made to supply a suitable M.


To show this, Lagrange argued along these lines. Fix a value of u,
then there is an integrating factor L such that L(pdx + qdy) = dt,
where t is a function of x and y; L is found by solving the ordinary
differential equation pdx + qdy = 0. Now let u vary, so that t also
depends on u and dt acquires the extra term $\frac{\partial t}{\partial u}du$. It is clear that
du − pdx − qdy = 0 is integrable when
$\left(L + \frac{\partial t}{\partial u}\right)du - dt = 0$ is integrable. He defined $P = L + \frac{\partial t}{\partial u}$ and showed
that the condition that P as a function of t, y, and u is a function of t and u
only is (6.1). He then let L′ be the integrating factor for Pdu − dt, and
deduced that L′L(du − pdx − qdy) is exact, and so LL′ = M.

Let p be a solution of (6.1). Lagrange now showed, by an argument I
omit, that it is enough that p contains an arbitrary constant for a
complete solution to the original partial differential equation
F(x, y, u, p, q) = 0 to be found.


So in principle, Lagrange had discovered a general method for 
solving non-linear first-order partial differential equations in two 
independent variables, although he had to admit that it could be too 
complicated to follow even in cases when the solution was already 
known, for example, for the equation 


q = pX + Y,
where X = X(x, y) and Y = Y(x, y).


True though it is that the various conditions on L, L’, and M can be 
translated into the equations for the characteristics that define the 
modern solution (see Appendix C), the summary account of Lagrange’s 
work by Eduard von Weber in the German Encyclopedia goes too far in 
implying that this was known to Lagrange. The use of characteristics is 
an important later development.³

Still less was there progress on partial differential equations in 
more than two independent variables, despite some inconclusive 
remarks at the end of the paper. Here, the problem at the time was that 
the integrating factor method cannot work: there is no good theory of 
an integrating factor for expressions of the form Pdx + Qdy + Rdz


because the analogue of Clairaut’s equations yields an over-determined 
system consisting of three equations for the integrating factor. 


6.3.2 Lagrange [173] 


We now turn to his account in his [173] of the types of solution a partial 
differential equation may have, and the relationships between what he 
called complete, general, and particular solutions. 


Before looking at it, it will help to look at the account in Courant and 
Hilbert ([49], Vol. 2, 22-27), which is clearer. Suppose that a first-order 
partial differential equation F(x, y,z, p,q) = 0 has a family of 


solutions z = f(x, y, a) that depend on a parameter a. If this family has 


an envelope, then this envelope is also a solution. Geometrically, this is 
clear: the envelope shares a tangent plane at the point of contact with 
the surface with parameter a, and this plane, therefore, belongs to the
family of tangent planes defined by the partial differential equation. 

A complete integral of the given partial differential equation is one
that depends on two independent parameters, so the solution is of the
form z = f(x, y, a, b). Given a complete integral, one can impose an
entirely arbitrary relation between a and b, say b = b(a), and then the
one-parameter family of surfaces z = f(x, y, a, b(a)) has an envelope
which is again a solution. What makes this valuable is that the new
solution is obtained by differentiation (which is easy) and elimination
of a parameter (which, however, may not be easy). So a general solution
is one that depends on one parameter, arising either from imposing a
relation such as b = b(a) on the parameters or as an envelope.


Finally, singular solutions of the partial differential equation are 
obtained from the two-parameter family of solutions as those 
envelopes that do not arise from a one-parameter family. 

Gaspard Monge gave this example in his Applications d’analyse
[202], Chap. 7. Consider the two-parameter family of surfaces

$$(x - a)^2 + (y - b)^2 + z^2 = 1.$$

They are all spheres of radius 1 with centres in the (x, y)-plane, and it is
easy to see that these surfaces all satisfy the partial differential
equation

$$z^2(1 + p^2 + q^2) = 1,$$

where p and q denote the partial derivatives of z with respect to x and y.


We shall see that this makes the surfaces what is called a complete 
integral of the partial differential equation. 


Consider the one-parameter family, where b = b(a); this selects
just those spheres with centres on the curve y = b(x) in the (x, y)-plane.
The envelope of this family is obtained by eliminating the
parameter a from the equations

$$(x - a)^2 + (y - b(a))^2 + z^2 = 1, \quad\text{and}\quad x - a + b'(a)(y - b(a)) = 0.$$

It is a tubular surface, called a canal by Monge.
The planes z = 1 and z = −1 are an envelope of the two-parameter


family of spheres, and they satisfy the partial differential equation. They 
are singular solutions of the partial differential equation, and they are 
not a tubular surface. 
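
The claims about this example can be tested numerically. The following sketch assumes that the partial differential equation in question is z²(1 + p² + q²) = 1, with p and q the partial derivatives of z with respect to x and y; the function names are hypothetical. It evaluates the equation on the upper half of one of the unit spheres, using central differences for p and q, and on the planes z = ±1, where p = q = 0.

```python
import math

def sphere_z(x, y, a, b):
    """Upper half of the unit sphere centred at (a, b, 0)."""
    return math.sqrt(1 - (x - a)**2 - (y - b)**2)

def pde_residual(z, p, q):
    """Left side minus right side of z^2 (1 + p^2 + q^2) = 1."""
    return z**2 * (1 + p**2 + q**2) - 1

def sphere_residual(x, y, a, b, h=1e-6):
    """Residual of the equation on a sphere, with p and q estimated numerically."""
    z = sphere_z(x, y, a, b)
    p = (sphere_z(x + h, y, a, b) - sphere_z(x - h, y, a, b)) / (2 * h)
    q = (sphere_z(x, y + h, a, b) - sphere_z(x, y - h, a, b)) / (2 * h)
    return pde_residual(z, p, q)

print(abs(sphere_residual(0.3, 0.2, 0.1, -0.1)) < 1e-6)   # a member of the complete integral
print(pde_residual(1.0, 0.0, 0.0), pde_residual(-1.0, 0.0, 0.0))  # the two planes
```

The planes satisfy the equation exactly although no choice of a and b turns a sphere into a plane, which is what makes them singular solutions.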

In his [174], Lagrange offered what he believed was the first general 
analysis of the problematic phenomenon first discovered by Clairaut in 
1734. He noted that Euler had discussed it, as by then had d’Alembert, 
Condorcet, and Laplace (in a paper seen by Lagrange but not yet 
published). Now Lagrange proposed to offer a new and complete 
analysis. 

He took Euler’s example of the differential equation

$$x\,dx + y\,dy = dy\,\sqrt{x^2 + y^2 - a^2},$$

which has the solution

$$\sqrt{x^2 + y^2 - a^2} = y + c$$

or

$$x^2 - 2cy - a^2 - c^2 = 0,$$

where c is an arbitrary constant, and worked his way round to
explaining that the circle is the envelope of the family of parabolas.
Lagrange now looked for the a priori reason for the existence of the 
phenomenon. He found that the differential equation specifies a 
condition on the solutions that is also met by the envelope of the 
complete integral, and he expressed this condition first in formal 


analytic terms and then in geometric terms. He also went on to note 
what happens when there is no envelope or only a one-point envelope; 
this explains other aspects of the original question. 

In Article V of the paper, Lagrange then extended this analysis to 
partial differential equations. A partial differential equation in three 
variables has two independent variables, so a complete integral will 
have two arbitrary constants, he explained. A general integral has one, 
and a particular integral none. 


6.3.3 Lagrange [175] 
The paper Lagrange [175] deals mostly with particular kinds of partial 
differential equations in which some geometric condition is imposed, 
and in it Lagrange also generalised some arguments he had used in his 
[173] to address the question of solving first-order quasi-linear partial 
differential equations (to use a more recent technical term) with any 
number of independent variables.⁴

He wrote the partial differential equation in a form that differs only
notationally from this:

$$\frac{\partial z}{\partial x} + P_1\frac{\partial z}{\partial x_1} + \cdots + P_n\frac{\partial z}{\partial x_n} = Z,$$

where z is an unknown function of the n + 1 variables $x, x_1, \ldots, x_n$ and
$P_1, \ldots, P_n$, and Z are known functions of $x, x_1, \ldots, x_n$, and z.
His method was to form the ordinary differential equations

$$dx = \frac{dx_1}{P_1} = \cdots = \frac{dx_n}{P_n} = \frac{dz}{Z}.$$

These can be solved, and the solutions involve n + 1 arbitrary constants
$\alpha_1, \ldots, \alpha_{n+1}$. If these constants are regarded as functions of
$x, x_1, \ldots, x_n$, and z, and $\varphi$ is an
arbitrary function of $\alpha_2, \ldots, \alpha_{n+1}$, then the solution of the partial
differential equation is given in the form

$$\alpha_1 = \varphi(\alpha_2, \ldots, \alpha_{n+1}).$$

To see what this means, it helps to let n = 1 and to work in $(x, x_1, z)$
space, which is three-dimensional. We now fix values of $\alpha_1$ and $\alpha_2$.
This gives us a curve in $(x, x_1, z)$ space as x varies. If we now let $\alpha_1$ be
a function of $\alpha_2$, then these curves form a surface. This surface is a
solution of the partial differential equation because it is composed of
curves that must lie in the solution surface because their tangents
satisfy the partial differential equation.

Lagrange did not prove this claim but said that the proof was 
contained in his paper of 1774. His comment on the method is 
interesting [175], 625: 


By this method one can therefore integrate in general every 
first-order partial differential equation in which the 
differentials only appear in a linear form, whatever the number 
of variables; at least the integration of these sorts of equations is 
reduced to that of some ordinary differential equations; but one 
knows that the art of the integral Calculus of partial differential 
equations only consists of reducing this calculus to that of 
ordinary differential equations, and that one regards a partial 
differential equation as integrated when its integral depends 
only on that of one or more ordinary differential equations. 


6.4 Exercises 

1. 
A ladder slides down a wall which is at right angles to the ground. 
What curve does it envelope? (Hint: use the angle of the ladder to 
the vertical as a parameter.) 

2. Verify the claims made in connection with Clairaut’s paradox. Show
that if $f(x, y, a) = x^2 + 2ay - a^2 - b^2 = 0$ then eliminating the
parameter a from that equation and the equation $\frac{\partial}{\partial a} f(x, y, a) = 0$
yields the equation $x^2 + y^2 = b^2$, from which the result follows.


Questions 


1. 
How important would you say that one-parameter families of 


curves were in the mathematics of the early eighteenth century? 


Footnotes 


1 See Capobianco, Enea, and Ferraro [29] for a discussion of the problem this posed for Euler’s ideas
about the foundations of the calculus.


2 See Engelsman [67]. 
3 See Weber [264], 338, written in 1900. 


4 We shall pick up that story when we look at mechanics and Hamilton-Jacobi theory in
Chap. 25. 
7.1 Introduction


Problems in which the solution is a curve (or perhaps a function) with 
some maximal or minimal property began to be studied at the end of 
the seventeenth century, as we saw in Chap. 2. Euler was the first to 
make a systematic study of problems of this kind, and his book the 
Methodus inveniendi (1744, E 65), which is full of different kinds of
examples, stimulated the young Lagrange to invent the methods of the
calculus of variations. These were not the modern methods, but an 
inspired, and mostly unexplained, formalism that was very useful but 
by no means clear. Out of these insights, the so-called Euler- 
Lagrange equations were discovered, the law of least action was 
proclaimed, and Lagrangian dynamics was created. 


7.2 The Euler-Lagrange Equations Discovered 


There are few topics in the history of mathematics that have a clear 
beginning. Most emerge out of a shifting array of problems, undergo 
various reformulations, split into branches, merge with others, and 
acquire new aspects. The calculus of variations went through these 
preliminary stages so quickly that it may almost be said to have begun 


with Euler in his Methodus inveniendi. Here, Euler set out a general 
method for finding functions that minimise or maximise a given 
integral, and are perhaps subject to other constraints. 

Euler thought of the calculus in the language of infinitesimals.¹ So
he took three points on a curve that is the graphical representation of y
as a function of x, say (x, y), (x′, y′), and (x″, y″), that are infinitesimal
distances apart and varied the curve so that it passed through the
infinitesimally nearby point (x′, y′ + nv) instead. He thought of these


changes as if they were finite and then treated them as infinitesimal, 
not as limits of finite changes. 

Euler now supposed that the curve is the one that maximises or 
minimises a certain integral over an interval of a function Z of x, y, and 
perhaps some derivatives of y with respect to x. Because the integral is 
an extreme, the effect of these infinitesimal changes must be zero. 
Euler let the infinitesimal horizontal distances be dx, the change in y at 
x be dy, and defined p by the equation dy = pdx, with similar changes 


at x′ = x + dx and x″ = x′ + dx.


He then considered the effect of these changes on the integral. The 
change is a sum of infinitesimal amounts that reflect the new value of y 
at x’ and the consequent changes in any other quantity that enters the 


integral, so the changes are Zdx + Z’dx + ---, where Z is the value of Z 
at (x,y, p), Z’ is the value of Z at (x’, y’, p’), and so on. But the change in 


the curve is concentrated in the infinitesimal region around the points 
(x’, y’) and (x’, y’ + nv). So the change in the integral is Zdx + Z’dx. 


Euler wrote

$$dZ = M\,dx + N\,dy + P\,dp, \quad dZ' = M'\,dx + N'\,dy' + P'\,dp',$$

and proceeded to calculate M, N, P, M′, N′, P′ in terms of the change
nv to the curve. He found that

$$dZ = P\frac{nv}{dx}, \quad dZ' = N'\,nv - P'\frac{nv}{dx}.$$

So the change in the integral is given by

$$(dZ + dZ')\,dx = nv(P + N'\,dx - P'),$$

and this, because the integral is an extremum, is zero.
Euler wrote P′ − P = dP, replaced N′ by N, and obtained
N dx − dP = 0 or

$$N - \frac{dP}{dx} = 0 \tag{7.1}$$


as a necessary condition for the curve to be an extremal of the integral. 
This is the first occurrence of the Euler-Lagrange equations in the 
calculus of variations. 
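
Euler’s polygonal argument can be imitated on a computer. The sketch below is an illustration only: it assumes the integrand Z = √(1 + p²) of arc length, for which N = 0 and P = p/√(1 + p²), divides the interval from 0 to 1 into n equal steps, and raises a single ordinate by a small amount eps, which plays the role of nv. The function names are hypothetical. According to the formula for the change in the integral, the rate of change should be close to P − P′, which vanishes for a straight line but not for a parabola.

```python
import math

def discrete_integral(ys, dx):
    """Polygonal sum of Z*dx with Z = sqrt(1 + p**2) and p = (y_next - y)/dx."""
    return sum(math.sqrt(1 + ((ys[k + 1] - ys[k]) / dx) ** 2) * dx
               for k in range(len(ys) - 1))

def variation_rate(curve, k=5, n=10, eps=1e-6):
    """Change in the sum, per unit shift, when the k-th ordinate alone is raised by eps."""
    dx = 1.0 / n
    ys = [curve(i * dx) for i in range(n + 1)]
    shifted = list(ys)
    shifted[k] += eps  # only the two segments meeting at this point change
    return (discrete_integral(shifted, dx) - discrete_integral(ys, dx)) / eps

def line(x):
    return 2 * x + 1

def parabola(x):
    return x * x

print(variation_rate(line))      # close to zero: p is constant, so P = P'
print(variation_rate(parabola))  # clearly non-zero: roughly P - P'
```

Only the two segments adjacent to the shifted point contribute, which is the discrete counterpart of the remark that the change in the curve is concentrated in an infinitesimal region.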

The confusion this exposition induces is not simply a matter of the 
use of infinitesimals. As Fraser notes ([106], 185), Euler has used the d 
symbol in two ways. First, to denote a fixed infinitesimal separation of 
the x coordinates, and this was standard practice in the Leibnizian 
calculus of the day. Second, to denote the change in quantity 
consequent upon the change in y′, and among these we find that
dy′ = nv, dp = nv/dx, and dp′ = −nv/dx while in this sense
dx = dy = dp″ = 0. Euler then extended this method to apply to


variations subject to constraints, which incidentally had the effect of 
highlighting the two uses of the d symbol. 

The next year Euler received a response to his book that surprised 
and delighted him. This was a letter written by the 19-year-old
Lagrange on 12 August 1755 in which he set out a new method for 
tackling these problems, which he illustrated with the solution of three 
examples. 

In the letter, Lagrange introduced the symbol δ for the variation in
the curve, so, for example, δx = 0. He wrote δFy to denote the change
in F, a function of y, and claimed with very little justification that

$$\delta\,dFy = d\,\delta Fy, \tag{7.2}$$

and so in particular dδy = δdy. His method thereafter is a judicious
combination of the new rule in Eq. (7.2) and integration by parts, and
among other results Lagrange obtained a better derivation of equation
(7.1), which now deserves to be called the Euler-Lagrange equation.
As Fraser notes ([106], 163) Lagrange surely introduced his new
symbol δ to sort out Euler’s ambiguous use of d, then somehow came
to the belief that d and δ commute, and had a good idea (which


Euler had missed) of using integration by parts. This is most useful 
when the integral has fixed end points because the variation is 
necessarily zero there. 

A rich correspondence between the two men followed. Euler did 
not, at first, appreciate the way Lagrange’s δ works. He also queried the


crucial step in which the passage from the vanishing of the integral that 
expresses the variation in the original integral leads to the vanishing of 
the integrand and thus the Euler-Lagrange equation. Lagrange, in his 
reply, gave a general argument that remains unconvincing. 

Between 1756 and 1760, Lagrange refined his method and focussed 
it on problems in mechanics, including the brachistochrone problem, 
and in 1760-1761 he published his own account of the calculus of 
variations [170]. The argument he set out in this paper is opaque at key 
stages, and as the calculus of variations developed numerous 
interpretations were provided that culminated in what today is called 
the direct method. As Fraser points out, the modern approach is not a 
reasonable interpretation of what Lagrange did, and the reader is 
referred to Fraser’s paper for the details. 

Lagrange argued that to find the minimum or maximum of an
integral, say $\int Z$, one does as one does in the calculus and
differentiates and equates to zero:

$$\delta \int Z = 0.$$

Here, Z is to be regarded as a function of variables x, y, z and their
differences dx, dy, dz, d²x, d²y, d²z, .... The equation, he said, offering
no explanation, could also be written as

$$\int \delta Z = 0.$$

So one writes Z out in full, and “as one sees easily”

$$\delta\,dx = d\,\delta x, \quad \delta\,d^2x = d^2\delta x,$$

“and so for the others”.
He then suggested that in any given problem, there is always some
relationship between δx, δy, δz, dδx, dδy, ....


We can at least look at how Lagrange determined the
brachistochrone or curve of quickest descent between two points.² He
took x, y, and z as a set of three mutually perpendicular axes, with the
x axis vertical, so the time of descent is given by

$$\int Z, \quad Z = \frac{ds}{\sqrt{x}}, \quad ds = \sqrt{dx^2 + dy^2 + dz^2}.$$


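The integrand Z involves four quantities, x, dx, dy, dz, so according to
Lagrange’s rules δZ is a sum of these four terms:

$$\delta Z = -\frac{\delta x\,ds}{2x\sqrt{x}} + \frac{dx\,\delta dx}{\sqrt{x}\,ds} + \frac{dy\,\delta dy}{\sqrt{x}\,ds} + \frac{dz\,\delta dz}{\sqrt{x}\,ds},$$

and all other quantities in the general theory vanish. In his terminology,

$$n = -\frac{ds}{2x\sqrt{x}}, \quad p = \frac{dx}{\sqrt{x}\,ds}, \quad P = \frac{dy}{\sqrt{x}\,ds}, \quad \varpi = \frac{dz}{\sqrt{x}\,ds}.$$

To find the curve of quickest descent, one therefore has the equations

$$n - dp = 0, \quad -dP = 0, \quad -d\varpi = 0,$$

and so

$$-\frac{ds}{2x\sqrt{x}} - d\,\frac{dx}{\sqrt{x}\,ds} = 0, \quad d\,\frac{dy}{\sqrt{x}\,ds} = 0, \quad d\,\frac{dz}{\sqrt{x}\,ds} = 0.$$

For these three equations to represent a unique curve it is necessary, he
said, that they reduce to two, which they do because, as he showed, the
second and third imply the first.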
He then integrated these two equations and obtained 


dy — 1 dz 1 
Vxds ya Vxds Vb 


0. 


whence 


dy vb 

dz Va 
This equation shows that the solution curve lies in a vertical plane. He 
now assumed that the x-axis passes through the curve, which takes care 
of the constant of integration when integrating both sides of the 
equation, and gave the equation of the plane as 


va 
Z=y—. 


Vb 


Lagrange then took coordinates x and t in this plane, where 


fy? +2=t This gave him z as a function of t and y as a function of t, 


and finally, on setting a = c,a differential equation connecting x and 
a 


t: 


Vxdx 
Ve=x 


are 


which is “the equation for a cycloid described on a horizontal base by a 


circle of diameter equal to c’”.° 


7.3 Maupertuis and the Principle of Least Action


Abstract though it is, the principle of least action was the occasion for 
the most unpleasant scientific controversy of the century. Pierre Louis 
Maupertuis, the head of the Berlin Academy, presented a paper to the 
Berlin Academy in 1744 in which he claimed to show that light travels 
in a way that continually minimises a quantity called its action (a 
concept to be defined below). He published an extended version of the 
same argument in 1746, which he now applied to the motion of a
mechanical system. He also gave the principle a profoundly teleological 
spin by suggesting that the system evolved to meet a pre-assigned goal, 
and made it the animating principle of all of nature and the basis of a 
proof of the existence of God. As he observed, Euler had made a strictly 
mathematical statement of the principle in his Methodus inveniendi (E 
65, [74]), but Maupertuis’s claim was far grander. 

In essence, Maupertuis’s claim was that a benign deity had seen to it 
to produce a world in which everything happened with a minimum of 
effort, or rather, that a world in which things happened with minimal 
effort was evidence for a benign deity. Quite why action corresponded 
to hard work was not clear, and the whole claim was ridiculed by 
Voltaire in 1759 in one of the great books of the Enlightenment, 
Candide. Voltaire could not accept that we lived in the best of all 
possible worlds when all too painfully it was the world of the Seven
Years’ War and the Lisbon earthquake of 1755.

Maupertuis was a vain man who courted fame. He was known as 
‘The Great Flattener’ because he had led the successful French 
expedition to Lapland in the late 1730s that verified the flattening of 
the Earth at its poles and helped to persuade Continental Europe of the 
merits of the Newtonian calculus and Newton’s theory of gravitation.⁴
He was friends with Voltaire and, importantly for this story, with the Swiss
Academician Samuel König, whom Voltaire had persuaded to teach his
lover Émilie du Châtelet algebra.⁵


However, in 1751, Samuel König published a paper criticising the
principle of least action. Maupertuis took it personally and accused the
author of plagiarism. He charged König with forging a letter by
Leibniz stating the principle of least action, which would have denied
Maupertuis priority. When this failed, he obtained Euler’s support and
tried to have König driven out of the Academy. It seems that Euler, who
was generally a benign man, usually got his way with Maupertuis by
humouring him, but felt that on this occasion he had to give something
back. Maupertuis’s actions enraged Voltaire, who promptly published a
pamphlet entitled Diatribe du docteur Akakia, médecin du Pape.
King Frederick, who took a lordly interest in his Academy in Berlin,
publicly supported Maupertuis, and was so enraged by Voltaire’s
pamphlet that he had it burned by the hangman in public places in
Berlin. Matters eventually calmed down, and in 1752 König was given
an official censure by the Academy but allowed to remain as an
Academician.

What, more precisely, was the principle of least action? 

Maupertuis expressed it this way: 


Whenever there is a change in nature, the quantity of action 
necessary for this change is as small as possible. The quantity of 
action is the product of the mass of the body by its speed and by 
the distance through which it has moved. 


Euler, in the second appendix to his Methodus inveniendi, wrote that a
body of mass M, moving with the speed that it would have
acquired by falling through a height v and travelling a distance ds, has a
quantity of motion of $M\,ds\sqrt{v}$, and


I say that the path the body will describe, by comparison with all
the others with the same start and end points, will minimise
$\int M\,ds\sqrt{v}$, or, if M is a constant, $\int ds\sqrt{v}$.


If we say that in falling a height h from rest a body of mass M loses an
amount of potential energy equal to Mgh, which is equal to the amount
of kinetic energy it gains, $\frac{1}{2}Mu^2$, then $h = \frac{u^2}{2g}$, and Euler is claiming that
the path of the particle minimises, up to a constant factor, the integral

\[ \int M u\,ds, \]

which is what Maupertuis said, and which we recognise as the product
of the momentum of the body times the distance through which it has
moved in an instant. If we also note that $u = \frac{ds}{dt}$, then we can say that
the integral is $M\int u^2\,dt$.

This agrees with the definition of the action in use today, which, in
problems involving particle motion, is the time integral of the difference of
the kinetic and potential energies in the motion. If the kinetic energy is
represented by T and the potential energy by V, then the action is
$\int (T - V)\,dt$, and in problems about motion under gravity, we have that
$T + V = 0$, so $T - V = 2T$.


The crucial point is that whatever deductions follow from the new 
principle have to agree with those that follow from Newton’s laws. So the
correct formula for the action is not a new quantity; it must be a
disguised version of one already known, and the minimising principle
can only be a disguised version of Newton’s laws. Or, of course, it could
be the other way round: the action principle is fundamental, Newton’s
laws follow from it, and we just happened to have discovered them first,
which, given the origin of mechanics in astronomy, was surely
inevitable. Either way, the philosophical implications Maupertuis drew 
were ultimately spun out of a misunderstanding. 

Before we dismiss them, however, it is worth observing that they 
are curious. It is not difficult to imagine a particle feeling a force at 
every instant and responding to it. It is much harder to imagine a 
particle considering every possible path between two points before 
deciding which one to take. In the first case, one can imagine the 
particle needs no help deciding what to do, but in the second case, one 


is tempted to imagine it appealing to an all-knowing higher authority, 
which is the essence of Maupertuis’s theological argument. 

That said, we are left with the problem of showing the equivalence 
of the principle of least action and Newton’s laws; however, a valid 
modern derivation requires more theory than we have at present.° 

There is also a strong pragmatic reason for preferring the principle 
of least action. It can be much easier to use in problems in dynamics 
because it fits very well with the framework of generalised coordinates, 
to which we now turn. 


7.4 Euler's Later Approach 


In his paper (E420, [97]), Euler returned to the calculus of variations
and derived its fundamental equations in a way that pointed the way to
all future treatments. Given the problem of finding an extremal of an
integral involving a function y(x), Euler supposed there was a one-parameter
family of functions y(x, t), such as y(x, t) = y(x) + tV(x), so
that y(x) can be approximated arbitrarily closely. He then considered
the variation with respect to t, evaluated at t = 0, and argued that at an
extremal this variation should vanish.
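The differential of a function Z(x, y) is, as he wrote,
$dZ = M\,dx + N\,dy$, where $N = \frac{\partial Z}{\partial y}$, and if the variation in y is all that is
considered then $dx = 0$ and $dy = \frac{\partial y}{\partial t}\,dt$, so one can write
$\frac{\partial Z}{\partial t} = N\frac{\partial y}{\partial t}$.

He wrote higher derivatives as

\[ p = \frac{\partial y}{\partial x}, \qquad q = \frac{\partial p}{\partial x} = \frac{\partial^2 y}{\partial x^2}, \qquad r = \frac{\partial q}{\partial x} = \frac{\partial^3 y}{\partial x^3}, \ldots, \]

and

\[ \frac{\partial p}{\partial t} = \frac{\partial^2 y}{\partial x\,\partial t}, \qquad \frac{\partial q}{\partial t} = \frac{\partial^3 y}{\partial x^2\,\partial t}, \qquad \frac{\partial r}{\partial t} = \frac{\partial^4 y}{\partial x^3\,\partial t}. \]

So now, if

\[ dZ = M\,dx + N\,dy + P\,dp + Q\,dq + R\,dr + \cdots \]

then, holding x fixed, so $dx = 0$, the infinitesimal variation in y
produces

\[ dy = \frac{\partial y}{\partial t}\,dt, \quad dp = \frac{\partial p}{\partial t}\,dt = \frac{\partial^2 y}{\partial x\,\partial t}\,dt, \quad dq = \frac{\partial^3 y}{\partial x^2\,\partial t}\,dt, \quad dr = \frac{\partial^4 y}{\partial x^3\,\partial t}\,dt. \]

So, the variation in Z is given by

\[ \frac{\partial Z}{\partial t}\,dt = N\frac{\partial y}{\partial t}\,dt + P\frac{\partial^2 y}{\partial x\,\partial t}\,dt + Q\frac{\partial^3 y}{\partial x^2\,\partial t}\,dt + R\frac{\partial^4 y}{\partial x^3\,\partial t}\,dt + \cdots \]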


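The variation of an integral goes like this:

\[ \delta\int Z\,dx = \int \delta Z\,dx = \int \frac{\partial Z}{\partial t}\,dt\,dx = dt\int \frac{\partial Z}{\partial t}\,dx. \]

Applying this to the expansion of $\frac{\partial Z}{\partial t}$, Euler deduced that the variation of
the integral is given by the series

\[ dt\int\left( N\frac{\partial y}{\partial t}\,dx + P\frac{\partial^2 y}{\partial x\,\partial t}\,dx + Q\frac{\partial^3 y}{\partial x^2\,\partial t}\,dx + \cdots \right). \]

He then integrated by parts, restricted attention to variations that
vanish at the boundary, and deduced that for an extremal

\[ dt\int dx\,\frac{\partial y}{\partial t}\left( N - \frac{dP}{dx} + \frac{d^2Q}{dx^2} - \cdots \right) = 0, \]

and therefore that the Euler-Lagrange equation holds:

\[ N - \frac{dP}{dx} + \frac{d^2Q}{dx^2} - \cdots = 0. \]

He discussed the geometric meaning of these assumptions in a section
at the end of the paper, and remarked that $\delta y = 0$ means that y(x, t)
and y(x) agree at the boundary points, $d\delta y = 0$ means that they have
parallel tangents at the boundary, and so on.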
Euler also showed how to extend the subject to find the equations 
governing the extremal of integrals of functions of two variables. 
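
Euler's one-parameter device is easy to try out numerically. The following is a minimal sketch under stated assumptions, not Euler's own computation: it takes the arc-length integrand $Z = \sqrt{1 + p^2}$ on the interval [0, 1], uses the family y(x, t) = y(x) + tV(x) with a hypothetical variation V(x) = sin(πx) that vanishes at both end points, and estimates the derivative of the integral with respect to t at t = 0 by a central difference. The function names and the step size are choices made for the illustration.

```python
import math

def integral(f, a, b, n=2000):
    """Composite Simpson rule for f on [a, b] (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def arc_length(dy, dV, t):
    """Integral of Z = sqrt(1 + p^2) over [0, 1], with p = y'(x) + t V'(x)."""
    return integral(lambda x: math.sqrt(1 + (dy(x) + t * dV(x)) ** 2), 0.0, 1.0)

def first_variation(dy, dV, eps=1e-4):
    """Central-difference estimate of d/dt of the integral at t = 0."""
    return (arc_length(dy, dV, eps) - arc_length(dy, dV, -eps)) / (2 * eps)

# Hypothetical variation V(x) = sin(pi x); only its derivative V' is needed.
dV = lambda x: math.pi * math.cos(math.pi * x)

dy_line = lambda x: 1.0          # y = x, the straight line from (0, 0) to (1, 1)
dy_parabola = lambda x: 2.0 * x  # y = x^2, another curve joining the same points

var_line = first_variation(dy_line, dV)
var_parabola = first_variation(dy_parabola, dV)
print(var_line, var_parabola)
```

For the straight line the estimated variation is zero to within rounding error, as it should be at an extremal, while for the parabola it is clearly negative, so the parabola can be shortened by moving in the direction of V.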


7.5 Brachistochrone and the Calculus of Variations

The Euler-Lagrange equation for an integrand F(x, y, y′) is

\[ \frac{d}{dx}F_{y'} - F_y = 0. \]

Remember that this involves the total derivative with respect to x, not
the partial derivative, so we have for any differentiable function G

\[ \frac{d}{dx}G(x, y, y') = G_x + G_y\,y' + G_{y'}\,y''. \]

Now for the brachistochrone, or curve of quickest descent. The problem
is to find the curve along which a frictionless mass point sliding along
the curve will descend under gravity (acting in the y direction) from
$A = (x_0, 0)$ to $B = (x_1, y_1)$ in the shortest time, given that its initial
velocity is zero. When the point has descended a distance y its
speed will be $\sqrt{2gy}$. At that moment, in an instant of time dt it moves
a distance ds, where

\[ ds^2 = dx^2 + dy^2 = \left[1 + \left(\frac{dy}{dx}\right)^2\right]dx^2 = (1 + y'^2)\,dx^2, \]

so at each instant of time

\[ dt = \sqrt{\frac{1 + y'^2}{2gy}}\,dx. \]

So the time of descent is

\[ T = \int_{x_0}^{x_1} \sqrt{\frac{1 + y'^2}{2gy}}\,dx. \]

We write

\[ F(x, y, y') = \sqrt{\frac{1 + y'^2}{2gy}}. \]

To compute the Euler-Lagrange equation, set $h = (1 + y'^2)^{1/2}$ and
$(2g)^{-1/2} = a$, so

\[ F(x, y, y') = a\,h\,y^{-1/2}. \]

Now calculate $h_{y'}$, $\frac{dh}{dx}$, $F_y$, $F_{y'}$, and $\frac{d}{dx}F_{y'}$, and verify that the
Euler-Lagrange equation simplifies to

\[ h^2 y''y + \frac{1}{2}h^4 = y'^2 y''y + \frac{1}{2}h^2 y'^2, \]

and therefore to

\[ (y'^2 - h^2)\,y''y = \frac{1}{2}h^2(h^2 - y'^2). \]

All is not lost, because $h^2 - y'^2 = 1$, so the Euler-Lagrange equation
becomes

\[ -y''y = \frac{1}{2}h^2 = \frac{1}{2}(1 + y'^2). \]

Verify that the equation

\[ y''y = -\frac{1}{2}(1 + y'^2) \]

is obtained by differentiating

\[ y(1 + y'^2) = k, \]

where k is a constant, and deduce that

\[ y' = \sqrt{\frac{k - y}{y}}. \]

Set $y = k\sin^2 z$ and deduce that the Euler-Lagrange equation has
become

\[ 2k\sin^2 z\,dz = dx, \]

and so

\[ x = \frac{k}{2}(2z - \sin 2z). \]

So x and y have been expressed in terms of a parameter z, and the
brachistochrone is found to be a cycloid obtained by a point on the
circumference of a wheel of diameter k that rolls on the x-axis.
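
The conclusion can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the historical argument: it takes the parametrisation x = (k/2)(2z − sin 2z), y = k sin² z, confirms that y(1 + y′²) stays equal to k along the curve, and compares the time of descent along the cycloid with the time down the straight chord between the same end points. The value of g, the choice k = 1 and the function names are assumptions made for the example; the formula dt = √(2k/g) dz for the cycloid follows from ds = 2k sin z dz and a speed of √(2gy).

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def cycloid(k, z):
    """Point on the cycloid; y is measured downwards from the starting point."""
    return 0.5 * k * (2 * z - math.sin(2 * z)), k * math.sin(z) ** 2

def first_integral(k, z):
    """Value of y (1 + y'^2) on the cycloid, with y' = (dy/dz) / (dx/dz)."""
    dx_dz = 2 * k * math.sin(z) ** 2
    dy_dz = 2 * k * math.sin(z) * math.cos(z)
    _, y = cycloid(k, z)
    return y * (1 + (dy_dz / dx_dz) ** 2)

def cycloid_time(k, z_end):
    """Descent time from z = 0 to z = z_end, using dt = sqrt(2k/g) dz."""
    return math.sqrt(2 * k / G) * z_end

def chord_time(x1, y1):
    """Descent time from rest down the straight line from (0, 0) to (x1, y1)."""
    length = math.hypot(x1, y1)
    accel = G * y1 / length  # component of gravity along the chord
    return math.sqrt(2 * length / accel)

k = 1.0
z_end = math.pi / 2
x1, y1 = cycloid(k, z_end)
t_cycloid = cycloid_time(k, z_end)
t_chord = chord_time(x1, y1)
print(t_cycloid, t_chord)
```

With these choices the cycloid is the quicker of the two paths, even though it is the longer one.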


7.6 Generalised Coordinates 


This material is here for later use, in Chaps. 24 and 25. 

Lagrange introduced the method of generalised coordinates in
mechanics in a memoir of 1788, when he returned to a problem he had
looked at in 1764 on the libration of the Moon.⁷ The virtue of his
method was that it enabled him to choose new variables with which to
analyse libration, and this gave new differential equations whose
solutions were easier to interpret.


Equally importantly, in those intervening years, as Fraser [105] 
discusses, Lagrange moved away from relying on the principle of least 
action as a fundamental principle in physics. This principle, as 
Maupertuis and even Euler had described it, had a metaphysical aspect 
that Lagrange increasingly disliked. He much preferred to put his trust 
in formal, algebraic arguments. By 1788, when Lagrange published his 
Méchanique Analitique, he disparaged the use of such phrases as “least 
action” 


as if these vague and arbitrary denominations comprise the 
essence of the laws of mechanics and can by some secret virtue 
establish in final causes the simple results of the known laws of 
mechanics. 


He might have picked up this attitude from his mentor, d’Alembert. Or, 
as Fraser speculates, he might also have come to realise that his 
approach, which grew out of d’Alembert’s, was more general, which it is
because it does not require the forces in a problem to be given by a 
potential function. The calculus of variations remained fundamental to 
Lagrange’s approach, not because it led to the formulation of a physical 
problem as the stationary value of an integral but because it led to a 
formulation in terms of differential equations. 

The manner in which Lagrange presented his new theory of
mechanics is not easy to read or to describe, and in the absence of a
thorough modern account that makes it easy to read him carefully and
accurately I have chosen to follow Lützen ([192], 640-642) and Pulte
[231] and to indicate the outlines of his achievements while
suppressing the details of his methods.⁸

We shall suppose that what is at issue is the motion of n particles,
and that the jth particle has mass $m_j$ and coordinates $(x_j, y_j, z_j)$ with
respect to some Cartesian frame of reference. Lagrange introduced new
variables $q_1, \ldots, q_N$, $N = 3n$, and supposed that every $x_j$, $y_j$, and
$z_j$ is a function of the new variables $q_1, \ldots, q_N$.

It is clear that in principle any system of equations that expresses
the motion of the n particles can be written as a system of equations in
the new variables. Specifically, consider the kinetic energy of the
system,

\[ T = \frac{1}{2}\sum_{j=1}^{n} m_j\left(\dot{x}_j^2 + \dot{y}_j^2 + \dot{z}_j^2\right). \]

Assume also that the forces are conservative, which means that they are
given by the gradient of a potential function U. Then T will
be a function of the variables $q_1, \ldots, q_N, \dot{q}_1, \ldots, \dot{q}_N$, and U will be a
function of the variables $q_1, \ldots, q_N$.

What Lagrange showed was that the equations of motion can be
written as

\[ \frac{d}{dt}\frac{\partial T}{\partial \dot{q}_j} - \frac{\partial T}{\partial q_j} = -\frac{\partial U}{\partial q_j}, \qquad j = 1, \ldots, N. \]

Later writers introduced the ‘Lagrangian’ $L = T - U$, in terms of which
the equations of motion take the form

\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0. \]


Lagrange was at least clear about the advantages of his method. Even 
the simplest mechanical problem expressed in rectangular Cartesian 
coordinates is tiresome to convert to a coordinate system better 
adapted to the problem, as, for example, spherical coordinates often 
are. The new coordinates are entirely general, so the equations are 
coordinate-free, and symmetries of the problem turn up as simpler 
systems of equations. 


Because these equations can have various forms that are less 
simple or more simple, and above all easier to integrate, it is not 
a matter of indifference in which form they are presented at the 
start; and it is perhaps one of the principal advantages of our 
method that it always provides the equations for each problem 
in the most simple form relative to the variables that it employs, 


and puts one in a position to judge in advance which are the 
variables to use that will most simplify the integration. Here, for 
this purpose, are some general principles that one will see 
applied in what follows to the solution of different problems. 
(Lagrange, Oeuvres Vol. 11, 336-337.) 
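
The advantage Lagrange describes can be seen in the standard textbook example of a plane pendulum. This is offered only as an illustration and is not taken from Lagrange's text; the symbols m (mass of the bob), ℓ (length of the rod), g (gravitational acceleration) and θ (angle from the downward vertical) are assumptions introduced for the example. With the single generalised coordinate θ, and the equations of motion written in the form d/dt(∂T/∂q̇) − ∂T/∂q = −∂U/∂q, the calculation is three lines long:

```latex
% Plane pendulum with the single generalised coordinate q_1 = \theta.
\begin{align*}
T &= \tfrac{1}{2} m \ell^2 \dot\theta^2, \qquad U = -m g \ell \cos\theta, \\
\frac{d}{dt}\frac{\partial T}{\partial \dot\theta} - \frac{\partial T}{\partial \theta}
  &= -\frac{\partial U}{\partial \theta}
  \;\Longrightarrow\; m \ell^2 \ddot\theta = -m g \ell \sin\theta, \\
\ddot\theta &= -\frac{g}{\ell}\sin\theta .
\end{align*}
```

In Cartesian coordinates the same problem requires two coordinates, a constraint, and the elimination of the tension in the rod.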


7.7 Exercises 


1. 
Prove that the shortest curve joining two points in the plane is the 


straight line between them. 


2.


Show that the curve in the (x, y)-plane joining the points da = 0 


and (a, b) that produces the surface of least area when rotated 
around the x-axis is the catenary y = a cosh x for a suitable value 


of a. This is equivalent to minimising the integral
\[ \int y\,(1 + y'^2)^{1/2}\,dx. \]


Questions 


1. 
If Newton’s laws and the principle of least action give the same 


answers to the same problems in dynamics, what is the difference 
between them? 


2.


Find and enjoy articles on Jakob Steiner and the isoperimetric 
problem. 


3.


The physicist Richard Feynman greatly appreciated the principle of 
least action and made it the basis of his theory of quantum 
mechanics. Find out what you can about his views and arguments 
—he provided numerous accessible accounts. 


Footnotes 


1 A good account of Euler’s method is provided in Fraser [106]. Importantly, it does not follow
the accounts of Carathéodory [31] and Goldstine [121] in interpreting Euler’s arguments by
bringing them into line with later methods, which obscures some of the points that
Lagrange was to criticise. For English translations of some of the work of Euler and Lagrange,
see Struik Source Book 399-413.


2 See Lagrange ([170], 339-341). 


3 It is easier to follow this argument on choosing the axes so that everything happens in the
plane z = 1. See also the derivation in Sect. 7.5. 


4 See Terrall [253] for a detailed account. 


5 Du Chatelet is remembered as the mathematician who translated Newton’s Principia into 
French. For an account of the difficulties this involved, and the merits of her extensive 
commentaries, see Zinsser [278]. 


6 It is nicely explained in the Notes for the Harvard course Mechanics 151, see
(http://www.people.fas.harvard.edu/~djmorin/chap6.pdf).


7 Libration is the slow oscillation in the motion of the Moon that enables us to see a little more 
than half of its surface (about 59%). It is largely a result of the elliptical orbit of the Moon. 
d’Alembert had published theoretical papers on it in 1761, and Cassini and Mayer had 
published observational accounts. 


8 For the original treatment, see Lagrange Mécanique analytique [178], reprinted in Lagrange 
Oeuvres 11, Part 2, Sect. IV, especially pp. 334 and 336. 
8.1 Introduction


Lagrange’s account of the solution of first-order partial differential 
equations is, at its core, formal and algebraic and not easy to 
understand. The account that has become the basis of all further 
elementary accounts was published in 1809 by Monge, in his 
Application de l’analyse à la géométrie (see the Addition, pp. 367-414),
and it brings out his remarkable geometrical gifts. Here, we look
briefly at its origins in Monge’s earlier work, and how he tried to extend 
these ideas to second-order partial differential equations. 


8.2 Monge and First-Order Partial Differential Equation

Monge (see Fig. 8.1) gave two accounts of how to solve partial
differential equations in two memoirs in the Histoire de l’Académie
Royale des Sciences for 1784 (published in 1787). The first eliminates
arbitrary constants and arbitrary functions from an equation by
repeated differentiation to show that the result is a partial differential
equation. The second paper then reverses this idea and shows how
arbitrary functions arise in the solutions of partial differential
equations.


Fig. 8.1 Gaspard Monge (1746-1818), artist unknown 


In §3 of his (1787b), Monge took up the quasi-linear first-order 
partial differential equation: 


Mp+Nq+L=0, 


where M, N, L are functions of x, y, and the unknown function z, and
$p = z_x$ and $q = z_y$.

He argued that the partial differential equation cannot be solved for
p and q, yet, paradoxically, it seems that it can be. For, using the
equation $dz = p\,dx + q\,dy$, one immediately obtains these two
equations for p and q:

\[ M\,dz + L\,dx = q(M\,dy - N\,dx), \]

\[ N\,dz + L\,dy = p(N\,dx - M\,dy). \]

Monge’s way out of this apparent impossibility was to argue that these
equations are to be understood as a set of restrictions on any function z
that satisfies the partial differential equation. They say that the system
of differential equations

\[ M\,dy - N\,dx = 0, \qquad M\,dz + L\,dx = 0, \qquad N\,dz + L\,dy = 0 \]

can be solved simultaneously (any two imply the third) and their
solution, for any arbitrary point $(x_0, y_0, z_0)$, defines a curve through
that point that lies in the solution surface. In other words, all solution
surfaces through that point have this curve in common.

To obtain the solution of the partial differential equation,
Monge then said that if two of these equations, or two of their
consequences, can be integrated explicitly, say to provide equations of
the form

\[ f(x, y, z) = a \quad \text{and} \quad g(x, y, z) = b, \]

where a and b are constants, then the complete integral of the partial
differential equation will be of the form

\[ f(x, y, z) = \varphi(g(x, y, z)), \]

where $\varphi$ is an arbitrary function.
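
A short worked example may make the recipe concrete. It is an illustration chosen for its simplicity, not one of Monge's own: take the equation xp + yq − z = 0, so that M = x, N = y, L = −z in the notation above, and write φ for the arbitrary function (the letter is an assumption of this sketch).

```latex
% Monge's recipe applied to x p + y q - z = 0, i.e. M = x, N = y, L = -z.
\begin{align*}
M\,dy - N\,dx = 0 &:\quad x\,dy - y\,dx = 0 \;\Longrightarrow\; g = \frac{y}{x} = b, \\
M\,dz + L\,dx = 0 &:\quad x\,dz - z\,dx = 0 \;\Longrightarrow\; f = \frac{z}{x} = a, \\
\text{so}\quad \frac{z}{x} &= \varphi\!\left(\frac{y}{x}\right), \qquad
  z = x\,\varphi\!\left(\frac{y}{x}\right). \\
\text{Check:}\quad p &= \varphi - \frac{y}{x}\,\varphi', \quad q = \varphi', \quad
  x p + y q = x\,\varphi = z .
\end{align*}
```

The solution surfaces are the cones with vertex at the origin, and the curves common to them are the straight lines through the origin.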


8.2.1 A Comparison with the Modern Account 


We write the first-order quasi-linear partial differential equation in the
form

\[ ap + bq = c, \]

where

\[ a = a(x, y, z), \qquad b = b(x, y, z), \qquad c = c(x, y, z), \]

and, as usual, $p = z_x$, $q = z_y$.

The solution to this partial differential equation will be a surface
with an equation of the form z = z(x, y), and any tangent to such a
surface at the point (x, y, z) is of the form (dx, dy, dz), which is normal to
$(p, q, -1)$, so the vector

\[ \bigl(a(x(t), y(t), z(t)),\; b(x(t), y(t), z(t)),\; c(x(t), y(t), z(t))\bigr), \]

where x(t), y(t), z(t) are functions of a variable t, is everywhere tangent
to this surface, and so the curves that satisfy

\[ \frac{dx}{a} = \frac{dy}{b} = \frac{dz}{c} \]

or

\[ \frac{dx}{dt} = a, \qquad \frac{dy}{dt} = b, \qquad \frac{dz}{dt} = c \]

lie in the surface, and indeed fill it out. They are called the
characteristic curves and they project down onto curves in the (x, y)
plane that were called the characteristic curves before.
It is a simple, instructive, and reassuring exercise to connect these
equations to the equations that Monge derived in 1784 (write
everything in terms of $\frac{dz}{c}$).



This suggests that we think of directional vectors at each point of
space: at the point (x, y, z), there is the vector
$(a(x, y, z), b(x, y, z), c(x, y, z))$. So if we are given a curve $\Gamma$ in space in
the form (x(s), y(s), z(s)) say for 0 < s < 1, then the characteristic
curves through the curve $\Gamma$, which we assume are not tangent to $\Gamma$, will
form the solution surface, and this is indeed the case. I omit the proof,
but note that there is something to prove.

It is helpful to give an example. We take the partial differential
equation

\[ zp + q = 1, \]

with initial conditions

\[ x = s, \qquad y = \cos s, \qquad z = \sin s, \qquad 0 \le s \le 1. \]

The transversality condition to be met is that along this curve

\[ a\,\frac{dy}{ds} - b\,\frac{dx}{ds} \neq 0, \]
and we have 
M(dx + Vdy) = ds, 


which is correct for $0 < s < \pi/2$ (indeed, until $\cos s = \frac{1}{2}(1 - \sqrt{5})$).


The solution of the partial differential equation is then given by
solving the system of ordinary differential equations

\[ \frac{dx}{dt} = z, \qquad \frac{dy}{dt} = 1, \qquad \frac{dz}{dt} = 1, \]

for which

\[ x = \frac{1}{2}t^2 + \alpha t + \beta, \qquad y = t + \gamma, \qquad z = t + \alpha, \]

and finding the solution that meets the initial conditions, which are that
when t = 0

\[ x = s, \qquad y = \cos s, \qquad z = \sin s. \]

So, the solution of the partial differential equation that meets the initial
conditions is

\[ x = \frac{1}{2}t^2 + t\sin s + s, \qquad y = t + \cos s, \qquad z = t + \sin s. \]
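
The parametric answer can be checked directly. The sketch below is an illustration with assumed function names: it recovers p and q at a point of the surface from the partial derivatives of x, y, z with respect to s and t (by solving $z_s = p\,x_s + q\,y_s$ and $z_t = p\,x_t + q\,y_t$ with Cramer's rule) and evaluates the residual zp + q − 1, which should vanish wherever the map from (s, t) to (x, y) is invertible.

```python
import math

def surface(s, t):
    """The parametrised solution surface of z p + q = 1 through the initial curve."""
    x = 0.5 * t * t + t * math.sin(s) + s
    y = t + math.cos(s)
    z = t + math.sin(s)
    return x, y, z

def residual(s, t):
    """Value of z p + q - 1 at the point with parameters (s, t)."""
    # Partial derivatives of x, y, z with respect to s and t.
    x_s, x_t = t * math.cos(s) + 1.0, t + math.sin(s)
    y_s, y_t = -math.sin(s), 1.0
    z_s, z_t = math.cos(s), 1.0
    # Solve z_s = p x_s + q y_s and z_t = p x_t + q y_t for p and q.
    det = x_s * y_t - x_t * y_s
    p = (z_s * y_t - z_t * y_s) / det
    q = (x_s * z_t - x_t * z_s) / det
    _, _, z = surface(s, t)
    return z * p + q - 1.0

worst = max(abs(residual(s, t))
            for s in (0.0, 0.3, 0.7, 1.0)
            for t in (-0.2, 0.0, 0.1, 0.2))
print(worst)
```

At t = 0 the surface reduces to the initial curve (s, cos s, sin s), and the residual is zero to within rounding error at every sampled point.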


8.2.2 The General First-Order Case 


Before we return to Monge, it is instructive to see how far one can go 
pursuing his method but on the general first-order partial differential 
equation. 

We write the partial differential equation in the form

\[ F(x, y, z, p, q) = 0. \]

The aim is to get equations for dx, dy, and dz. We shall also need
equations for dp and dq, which we did not need before because p and q
were presented linearly then.

We can differentiate this equation with respect to p and obtain

\[ F_p + F_q\,\frac{dq}{dp} = 0. \tag{8.1} \]

As with the quasi-linear case, it is helpful to think of a surface with an
equation z = z(x, y) that satisfies this partial differential equation. Let
$(x_0, y_0, z_0)$ be a point on this surface, then the tangent plane to the
surface at that point satisfies the equation

\[ z - z_0 = p(x - x_0) + q(y - y_0). \]

If we differentiate the equation for the tangent plane with respect to p,
we obtain

\[ 0 = x - x_0 + (y - y_0)\frac{dq}{dp}, \]

and by thinking of (x, y) as infinitesimally close to $(x_0, y_0)$ we write this
as

\[ dx + dy\,\frac{dq}{dp} = 0. \tag{8.2} \]

We eliminate $\frac{dq}{dp}$ from Eqs. (8.1) and (8.2) and deduce that

\[ \frac{dx}{F_p} = \frac{dy}{F_q}. \]

A nice piece of elementary algebra tells us that therefore

\[ \frac{dx}{F_p} = \frac{dy}{F_q} = \frac{dz}{pF_p + qF_q}, \tag{8.3} \]

or, equivalently, that

\[ \frac{dx}{dt} = F_p, \qquad \frac{dy}{dt} = F_q, \qquad \frac{dz}{dt} = pF_p + qF_q. \]

These are equations for curves that lie in the solution surface, but they
are not enough to determine the solution surface given some initial
data, because the equations still contain p and q. We can of course
assume, locally and without much loss of generality, that we can solve
the partial differential equation for q and obtain

\[ q = q(x_0, y_0, z_0, p). \]

This is an equation for all the planes at the point $(x_0, y_0, z_0)$ that
envelope a cone that touches the solution surface at that point. It
became known as the Monge cone.
However, purely formally, we have

\[ \frac{dp}{dt} = p_x\frac{dx}{dt} + p_y\frac{dy}{dt} = p_xF_p + p_yF_q, \]

\[ \frac{dq}{dt} = q_x\frac{dx}{dt} + q_y\frac{dy}{dt} = q_xF_p + q_yF_q, \]

and from the partial differential equation we obtain by differentiating
with respect to x and y

\[ F_x + F_zp + F_pp_x + F_qq_x = 0, \]

\[ F_y + F_zq + F_pp_y + F_qq_y = 0. \]

So, because $p_y = q_x$, the previous two equations can be written as

\[ \frac{dp}{dt} = -F_x - F_zp, \qquad \frac{dq}{dt} = -F_y - F_zq. \]

These equations, and the three before, give a set of five that reduce the
solution of a first-order partial differential equation F(x, y, z, p, q) = 0
to little more than algebra:

\[ \frac{dx}{dt} = F_p \tag{8.4} \]
\[ \frac{dy}{dt} = F_q \tag{8.5} \]
\[ \frac{dz}{dt} = pF_p + qF_q \tag{8.6} \]
\[ \frac{dp}{dt} = -F_x - pF_z \tag{8.7} \]
\[ \frac{dq}{dt} = -F_y - qF_z. \tag{8.8} \]

Geometrically, these equations determine a curve that lies in the
solution surface and a family of tangent planes that are tangent not only
to the curve but also to the surface (this is the contribution of the
equations for $\frac{dp}{dt}$ and $\frac{dq}{dt}$). They define what came to be called a
characteristic strip.

The solution of the partial differential equation then proceeds 
much as before. An initial curve in space is given, and if it is crossed by 
a family of characteristic strips that are never tangent to it then these 
strips determine a unique solution, at least in a neighbourhood of the 
initial curve. 
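
To see a characteristic strip in action, here is a small numerical sketch. It is an illustration under stated assumptions, not a method from the sources: it integrates the five equations dx/dt = F_p, dy/dt = F_q, dz/dt = pF_p + qF_q, dp/dt = −F_x − pF_z, dq/dt = −F_y − qF_z with a classical fourth-order Runge-Kutta step for the sample equation F = pq − z = 0, starting from the hypothetical strip element (x, y, z, p, q) = (0, 0, 1, 2, 0.5), which satisfies F = 0. For this F the strip can also be found by hand (p and q grow like e^t and z like e^{2t}), which gives something to compare with.

```python
import math

def F(x, y, z, p, q):
    """Sample first-order equation F = p q - z (chosen for illustration)."""
    return p * q - z

def grad_F(x, y, z, p, q):
    """Partial derivatives (F_x, F_y, F_z, F_p, F_q) of the sample F."""
    return (0.0, 0.0, -1.0, q, p)

def strip_rhs(state):
    """Right-hand sides of the five characteristic-strip equations."""
    x, y, z, p, q = state
    Fx, Fy, Fz, Fp, Fq = grad_F(x, y, z, p, q)
    return (Fp, Fq, p * Fp + q * Fq, -Fx - p * Fz, -Fy - q * Fz)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = strip_rhs(state)
    k2 = strip_rhs(shifted(k1, h / 2))
    k3 = strip_rhs(shifted(k2, h / 2))
    k4 = strip_rhs(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, t_end, steps):
    """Follow the strip from t = 0 to t = t_end."""
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

start = (0.0, 0.0, 1.0, 2.0, 0.5)   # hypothetical initial element with F = 0
end = integrate(start, 1.0, 200)
print(end, F(*end))
```

The value of F stays at zero along the strip, as it must, since the five equations were derived on the assumption that the curve and its tangent planes remain on a solution surface.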

For a statement of the existence and uniqueness theorem for the 
first-order partial differential equation 


F(x, y, Z, p,q) = 0 
and a sketch of its proof, see the Appendix (Chap. C). 


8.3 Monge on General First-Order Equation 


Monge reworked and extended this analysis in his [202]. His argument 
demonstrates his acute geometrical vision, but for that reason, it is not 
easy to follow. The comparison with Cauchy’s later, analytic method 
(see Sect. 31.1) is instructive both mathematically and historically. 

Monge first satisfied himself that any problem involving the 
envelope of a one-parameter family of surfaces is a problem involving a 
first-order partial differential equation: 


\[ F(x, y, z, p, q) = 0. \]

To do so, he began with a one-parameter family of surfaces

\[ f(x, y, z, \alpha, \beta) = 0, \]

where $\beta$ is a function of $\alpha$, say $\beta = \varphi(\alpha)$, and considered what he
called the envelope of these surfaces.
If $S_\alpha$ is a surface in the family, and S is the surface they envelope,
then $S_\alpha$ touches S along a curve $C_\alpha$ that Monge called a characteristic
because it characterises the contact of $S_\alpha$ and S. We are to think of the
surface $S_\alpha$ moving, changing its shape as it goes, and sweeping out a
surface S. At each moment, $S_\alpha$ and S touch along $C_\alpha$, so we can also
think of $C_\alpha$ as sweeping out the surface S.


Monge considered the tangent plane to the envelope S at a point P 
and noted that there are distinguished directions at the point. The 
tangent plane can roll in many ways, but two stand out: along the 
characteristic, and about the tangent to the characteristic (thought of 
as an axis). In the first case, the tangent plane stays on the same 


characteristic; in the second case, it pushes in a direction that leaves the 
characteristic and defines a new curve that Monge called a trajectory. 

Thus motivated, he considered a partial differential equation of the
above form. Any solution of it will be a surface and locally therefore of
the form z = z(x, y). Differentiating F, and writing X, Y, Z, P, Q for its
partial derivatives with respect to x, y, z, p, q, gives

\[ X\,dx + Y\,dy + Z\,dz + P\,dp + Q\,dq = 0, \]

and because one always has

\[ dz = p\,dx + q\,dy \tag{8.9} \]

Monge deduced that

\[ (X + pZ)\,dx + (Y + qZ)\,dy + P\,dp + Q\,dq = 0. \tag{8.10} \]

He then looked at the characteristics with parameters $\alpha$ and $\alpha + d\alpha$,
which amounts to letting

\[ dp = 0, \qquad dq = 0, \]

and deduced that

\[ (X + pZ)\,dx + (Y + qZ)\,dy = 0, \tag{8.11} \]

which defines the projection of a trajectory on the (x, y)-plane. This
equation is equally well obtained by stipulating that $P\,dp + Q\,dq = 0$,
so any surface satisfying this condition will touch the curve.

Varying the tangent plane by rolling it along the envelope generates
a developable surface, and this allowed Monge to differentiate
$dz = p\,dx + q\,dy$ keeping dx, dy, dz fixed. In this way Monge obtained
the equation

\[ dp\,dx + dq\,dy = 0 \]

for the line common to two ‘neighbouring’ tangent planes. If in
particular the tangent plane rolls along the characteristic, and therefore
rotates about the trajectory, then the value of $\frac{dy}{dx}$ is given by Eq. (8.11) and
so

\[ (X + pZ)\,dq - (Y + qZ)\,dp = 0. \tag{8.12} \]

Similarly, if the moving plane rolls along a trajectory, then

\[ P\,dy - Q\,dx = 0, \tag{8.13} \]

which is an equation for the characteristic.

So, Monge concluded, the Eqs. (8.9), (8.10), (8.12), and (8.13)
belong to the characteristic.

Monge then observed that these four equations involve the
differentials dp, dq, dx, dy, dz and that one can deduce ten equations
each of which involves only two differentials. He listed these equations
on p. 380 but, as he pointed out, only four are independent:

\[ \frac{dx}{P} = \frac{dy}{Q} = \frac{dz}{Pp + Qq} = -\frac{dp}{X + pZ} = -\frac{dq}{Y + qZ}. \tag{8.14} \]

When it came to solving these equations, Monge dealt first with what
he called the linear case

\[ Pp + Qq = L, \]

where P, Q, and L involve only x, y, and z. In this case, the equations
that are necessary are only

\[ P\,dy - Q\,dx = 0, \qquad P\,dz - L\,dx = 0, \qquad Q\,dz - L\,dy = 0, \]

which describe the projections of the characteristics on the three
coordinate planes.

He noted that if these differential equations are easy to solve then 
one should do so, but if they are more intractable, then because they 
define a curve, one can regard x, y, and z as all being functions of a 
single variable, that one might as well take to be z. In which case, 
eliminating y between these equations leads to a second-order ordinary 
differential equation for x as a function of z. He then showed how, in 


this case, the envelope can be regarded as swept out by a curve that 
moves and changes its form in space in a way that is prescribed by the 
partial differential equation. 

Monge then turned to the general case. His method was to see what
can be done by regarding everything possible as a function of p. Four of
the ten equations he had listed before involve either $\frac{dx}{dp}$, $\frac{dy}{dp}$, $\frac{dz}{dp}$, or $\frac{dq}{dp}$
in an equation with x, y, z, p, q. Systematic elimination produces a third-order
ordinary differential equation for q as a function of p that
involves only p and q but not x, y, or z.

Monge then, in a way I shall not describe, showed how to produce
from this equation the solution to the partial differential equation. But
he admitted that this process could lead to long analytic difficulties and 
that it would be more useful to turn to interesting special cases, with 
which he proceeded to conclude his account. 

I omit Monge’s lengthy description of the most general case, 
because, as he put it himself (p. 409): 


The geometrical considerations on which we have based the 
study of the equations for the characteristics are familiar to 
students at the École Polytechnique, but they can be hard going
for other readers, and we shall therefore derive the same 
equations by a purely analytical process. We shall begin with the 
case of linear equations, and then pass to the general case. 


Monge also discussed the difficulties that arise in the many cases in 
which the equations for the characteristics are not immediately 
integrable and then turned his attention to partial differential 
equations that are reducible to the above (quasi-) linear form. Among 
these was a class of developable surfaces with equation 


\[ F(z - px - qy, p, q) = 0 \]
that, Monge remarked, 


includes a great many of those that M. Lagrange has treated in 
his beautiful work on particular integrals, printed in the 
Mémoires de lAcadémie de Berlin for the year 1774. 


In fact, Monge’s equations are enough to solve the equation
$F(x, y, z, p, q) = 0$. Given a point $(x_0, y_0, z_0)$ and values $(p_0, q_0)$ such that


\[ F(x_0, y_0, z_0, p_0, q_0) = 0, \]


the equations determine a curve through the point and a set of planes,
one for each point on the curve, along which the planes are tangent to a
surface z = z(x, y) that satisfies the partial differential equation. This
curve is a characteristic curve, and the curve and these tangent planes
define a characteristic strip. If the initial point lies on a curve that is
transversal to a family of characteristics, then the family of planes
envelops a surface z = z(x, y) that satisfies the partial differential
equation.


8.4 Monge on Second-Order Partial Differential Equations


In his ([201], Sect. 8) Monge extended these methods to the study of
various second-order partial differential equations, which he wrote in
the form


\[ Ar + Bs + Ct + D = 0, \]


where A, B, C, D are arbitrary functions of x, y, z, p, q. He now argued
that the partial differential equation cannot yield expressions for r, s,
and t, but the use of the equations¹


\[ dp = r\,dx + s\,dy \quad\text{and}\quad dq = s\,dx + t\,dy \]


seemingly leads to these expressions for them:


\[ B\,dp\,dy + C\,dq\,dy - C\,dp\,dx + D\,dy^2 = -r\,(A\,dy^2 - B\,dx\,dy + C\,dx^2), \]


\[ A\,dp\,dy + C\,dq\,dx + D\,dx\,dy = s\,(A\,dy^2 - B\,dx\,dy + C\,dx^2), \]


\[ A\,dp\,dx - A\,dq\,dy + B\,dq\,dx + D\,dx^2 = -t\,(A\,dy^2 - B\,dx\,dy + C\,dx^2). \]


These equations cannot hold in general, so if they hold simultaneously
it must be along certain curves in the solution surface. If these
equations


\[ A\,dy^2 - B\,dx\,dy + C\,dx^2 = 0, \]
\[ B\,dp\,dy + C\,dq\,dy - C\,dp\,dx + D\,dy^2 = 0, \]
\[ A\,dp\,dy + C\,dq\,dx + D\,dx\,dy = 0, \]
\[ A\,dp\,dx - A\,dq\,dy + B\,dq\,dx + D\,dx^2 = 0, \]


hold simultaneously (any two imply the other two), and if two of these
equations have the solutions $U = a$ and $V = b$, where a and b are
arbitrary constants of integration, then the general solution of the
partial differential equation is $V = \varphi(U)$, where $\varphi$ is an arbitrary
function.
He gave some examples of how his method works in practice. In §12
he supposed that A, B, C, D are constants. The equation


\[ A\,dy^2 - B\,dx\,dy + C\,dx^2 = 0 \]

gives rise to equations

\[ dy - k\,dx = 0 \quad\text{and}\quad dy - k'\,dx = 0, \]

and so

\[ y - kx = a \quad\text{and}\quad y - k'x = b, \]

where a and b are constants and k and k′ are the roots of the equation
$Ak^2 - Bk + C = 0$. The equation


\[ A\,dp\,dy + C\,dq\,dx + D\,dx\,dy = 0 \]

then becomes

\[ Ak\,dp + C\,dq + Dk\,dx = 0, \]

which implies

\[ Akp + Cq + Dkx = \text{constant}. \]


Monge deduced that a first integral of the partial differential equation
is


\[ Akp + Cq + Dkx = \varphi'(y - kx), \]

and another is

\[ Ak'p + Cq + Dk'x = \psi'(y - k'x), \]

where the functions $\varphi$ and $\psi$ are arbitrary, so he deduced that the
general solution of the original partial differential equation is

\[ Az + \tfrac{1}{2}Dx^2 = \varphi(y - kx) + \psi(y - k'x). \]


Monge did not distinguish the cases when k, k′ are real and when they
are complex conjugate (or even when they are equal), and this
indifference to concerns raised by Euler forty years earlier may be
down to Monge’s optimism that the problem Euler had pointed to could
be solved more easily than was to turn out to be the case.
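
The constant-coefficient case can be checked numerically. The following minimal Python sketch uses central finite differences to test that a function of the form Az + ½Dx² = φ(y − kx) + ψ(y − k′x) satisfies Ar + Bs + Ct + D = 0. The coefficient values, the choices φ = sin and ψ(u) = u³, and every function name are illustrative assumptions, not anything taken from Monge.

```python
import math

# Illustrative constants (assumed); B*B - 4*A*C > 0 so that k and k' are real.
A, B, C, D = 1.0, 3.0, 2.0, 4.0
disc = math.sqrt(B * B - 4 * A * C)
k1, k2 = (B - disc) / (2 * A), (B + disc) / (2 * A)  # roots of A k^2 - B k + C = 0


def phi(u):
    return math.sin(u)  # arbitrary smooth function (assumed choice)


def psi(u):
    return u ** 3       # arbitrary smooth function (assumed choice)


def z(x, y):
    # Solve A z + D x^2 / 2 = phi(y - k x) + psi(y - k' x) for z.
    return (phi(y - k1 * x) + psi(y - k2 * x) - 0.5 * D * x * x) / A


def residual(x, y, h=1e-3):
    # Central-difference approximations to r = z_xx, s = z_xy, t = z_yy.
    r = (z(x + h, y) - 2 * z(x, y) + z(x - h, y)) / h ** 2
    t = (z(x, y + h) - 2 * z(x, y) + z(x, y - h)) / h ** 2
    s = (z(x + h, y + h) - z(x + h, y - h)
         - z(x - h, y + h) + z(x - h, y - h)) / (4 * h ** 2)
    return A * r + B * s + C * t + D


# The residual should be zero up to discretisation and rounding error.
print(residual(0.3, 0.7), residual(-1.2, 0.4))
```

Choosing coefficients with B² < 4AC makes k and k′ complex, which is precisely the case that Monge did not separate out; this sketch does not attempt it.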

Monge’s next example (§13) was the partial differential equation
for surfaces generated by a line moving in space while remaining
parallel to the (x, y)-plane:


\[ q^2 r - 2pqs + p^2 t = 0. \]

His method leads to the equations


\[ q^2\,dy^2 + 2pq\,dx\,dy + p^2\,dx^2 = 0 \quad\text{and}\quad q^2\,dp\,dy + p^2\,dq\,dx = 0. \]

The first leads to

\[ (q\,dy + p\,dx)^2 = 0, \]


which implies that z = a, a constant. The second then implies that


\[ p\,dq - q\,dp = 0 \]

and so q = bp where b is a constant, and after a little work the solution
of the partial differential equation is found to be

\[ x + y\varphi(z) = \psi(z), \]


a result already known. (Exercise 1 below asks you to derive the partial
differential equation generated by a line moving as described.)
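
As a quick illustration of this solution (with the particular choices φ(z) = z and ψ(z) = 2z, which are assumptions made only for this example), the relation x + yz = 2z can be solved explicitly for z and substituted into the equation:

```latex
% Illustrative check, assuming \varphi(z) = z and \psi(z) = 2z.
\[
  x + yz = 2z \quad\Longrightarrow\quad z = \frac{x}{2 - y},
\]
\[
  p = \frac{1}{2-y}, \quad q = \frac{x}{(2-y)^2}, \quad
  r = 0, \quad s = \frac{1}{(2-y)^2}, \quad t = \frac{2x}{(2-y)^3},
\]
\[
  q^2 r - 2pqs + p^2 t
  = 0 - \frac{2x}{(2-y)^5} + \frac{2x}{(2-y)^5} = 0 .
\]
% Along each generating line z is constant, and q/p = x/(2-y) = z,
% which is the relation q = bp with b constant on that line.
```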


8.5 Lagrange at the Ecole Polytechnique, 1806 


In his lectures in 1806 at the still-new Ecole Polytechnique
Lagrange gave another account, which was very close to the one
Monge had given some years before. He began Lecture 20 by observing
that there are three immediate consequences of an equation
F(x, y, z) = 0, which are obtained by finding the total differential of F
and taking the partial derivatives with respect to x and y:


\[ F_x\,dx + F_y\,dy + F_z\,dz = 0, \]
\[ F_x + F_z z_x = 0, \]
\[ F_y + F_z z_y = 0. \]


Lagrange then showed that these are useful when looking for a solution
of a first-order partial differential equation, by giving two examples. In
the first, the equation is


\[ z_x + M z_y = N, \]


where M and N are constants. Lagrange introduced the ‘primitive’
equation


\[ z - Nx = \varphi(y - Mx), \]


and differentiated it partially with respect to x and y. The resulting two
equations imply that $z = Nx + \varphi(y - Mx)$ is a solution of the partial
differential equation.
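
The step from the primitive equation to the partial differential equation can be written out in full. This is a short worked version of the differentiation just described, using the chain rule on the arbitrary function:

```latex
% Differentiate z - Nx = \varphi(y - Mx) with respect to x and to y.
\[
  z_x - N = -M\,\varphi'(y - Mx), \qquad z_y = \varphi'(y - Mx),
\]
% Eliminate \varphi' between the two equations:
\[
  z_x + M z_y = N - M\,\varphi'(y - Mx) + M\,\varphi'(y - Mx) = N .
\]
```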

In the second example, M and N are now regarded as functions of
x, y, z. If the solution (or primitive equation) is of the form
F(x, y, z) = 0 then partial differentiation of F with respect to x and y
implies that

\[ F_x + M F_y + N F_z = 0, \]


and by the first of the three consequences above, regarding x, y, z as
functions of a variable t, the partial differential equation becomes


\[ dF = (dy - M\,dx)F_y + (dz - N\,dx)F_z = 0. \]


So, said Lagrange, the solution of the partial differential equation is
found by solving (the fourth and fifth consequences)


\[ dy - M\,dx = 0 \quad\text{and}\quad dz - N\,dx = 0. \]


This is exactly what we have already seen in Monge’s treatment of the
quasi-linear equation.
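
To see these two ordinary differential equations at work, here is a minimal Python sketch that traces one characteristic of z_x + M z_y = N numerically. The coefficients M = x and N = z are hypothetical choices made only for illustration; for them the equations dy − M dx = 0 and dz − N dx = 0 integrate to y − x²/2 = a and z e^(−x) = b, so both quantities should stay constant along the computed curve. The integrator is a standard fourth-order Runge-Kutta step, not a method used by Lagrange.

```python
import math


def M(x, y, z):
    return x  # hypothetical coefficient, chosen for illustration


def N(x, y, z):
    return z  # hypothetical coefficient, chosen for illustration


def rk4_step(x, y, z, h):
    """One Runge-Kutta step for dy/dx = M(x, y, z), dz/dx = N(x, y, z)."""
    def f(x, y, z):
        return M(x, y, z), N(x, y, z)

    k1 = f(x, y, z)
    k2 = f(x + h / 2, y + h / 2 * k1[0], z + h / 2 * k1[1])
    k3 = f(x + h / 2, y + h / 2 * k2[0], z + h / 2 * k2[1])
    k4 = f(x + h, y + h * k3[0], z + h * k3[1])
    y += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    z += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x + h, y, z


def trace(x0, y0, z0, x_end, steps=1000):
    """Follow the characteristic through (x0, y0, z0) up to x = x_end."""
    h = (x_end - x0) / steps
    x, y, z = x0, y0, z0
    for _ in range(steps):
        x, y, z = rk4_step(x, y, z, h)
    return x, y, z


x, y, z = trace(0.0, 1.0, 2.0, 1.0)
# Both invariants keep their starting values a = 1 and b = 2 along the curve.
print(y - x * x / 2, z * math.exp(-x))
```

Any relation between the two invariants, such as z e^(−x) = φ(y − x²/2) with φ arbitrary, then gives a solution surface of this illustrative equation.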

The solution of these ordinary differential equations introduces two
arbitrary constants a and b, and the result of eliminating the variable z
from them will be a second-order equation in x and y in which one can
set $d^2x = 0$ if one wants to regard y as a function of x, or $d^2y = 0$ if one
wants to regard x as a function of y. Then z can be found as a function of
x and y.


The answer is now given as an implicit function F(x, y, z) = 0 in
which the constants a and b appear, which Lagrange wrote as F(a, b) = 0,
highlighting that a and b are functions of each other. Moreover, the
function F involves only these two constants (other than the ones in M
and N), and so eliminating two of the variables x, y, z will eliminate the
third (because dF = 0 and so F cannot be a function of the remaining
variable). The primitive equations yield expressions for a and b as
functions of x, y, z, say a = P(x, y, z) and b = Q(x, y, z), so the primitive
equation becomes F(P, Q) = 0.


Then he returned to the case of two independent variables and, as it 
were, ran the above argument backwards to show how first-order 
partial differential equations arise by eliminating the two parameters a 
and b from an equation of the form 


F(x, y, Zz, a, b) = 0, 


which he called the complete primitive equation (he required that a 
single differentiation cannot eliminate both parameters at the same 
time). This equation, he showed, leads to a more general one involving
an arbitrary function, on supposing that $b = \varphi(a)$ and letting a satisfy


\[ \frac{\partial}{\partial a} F(x, y, z, a, \varphi(a)) = 0. \]


Finally, the singular primitive equation is obtained by letting a and b
satisfy


\[ \frac{\partial F}{\partial a} = 0, \quad \frac{\partial F}{\partial b} = 0. \]

He gave this example. The partial differential equation

\[ z = x z_x + y z_y \]


has the complete primitive equation


\[ z = ax + by. \]

To obtain the primitive general equation, he set $b = \varphi(a)$ and took
derivatives with respect to a only, thus finding

\[ z = ax + y\varphi(a), \quad x + y\varphi'(a) = 0, \tag{8.15} \]
from which it was necessary to eliminate a. Because $\varphi$ was arbitrary,
this led to an infinity of different complete primitives, each with two
arbitrary constants. If, for example,


\[ \varphi(a) = A - \frac{a^2}{4B}, \]


the procedure just outlined results in


\[ a = \frac{2Bx}{y} \quad\text{and}\quad z = Ay + \frac{Bx^2}{y} \]


as a new form for the primitive equation.

In this way, he remarked, one can find as many complete primitives
as one likes, but the general primitive equation is never among them. In
the present case, it is impossible to find a function $\varphi(a)$ such that the
resulting primitive equation is z = Ax + By. I omit the proof.


Lagrange then discussed the theory of envelopes and its role in the 
theory of partial differential equations and remarked (p. 348) 


One can see, in the writings of Monge, the theory of the 
generation of these surfaces and the equations that can 
represent them developed to its full extent and with particular 
and ingenious considerations that belong to them. 


He then returned to the solution method for first-order partial 
differential equations that he had discussed 34 years earlier, in his 
[173], and noted that some difficulties remained to be resolved that, 
moreover, he had not been able to treat in the Théorie des fonctions. 


Lagrange’s method for tackling first-order partial differential 
equations is frequently presented in textbooks as the Lagrange-Charpit 
method or even as Charpit’s method. Many historians say that we know 
almost nothing about Charpit, and report only that Lacroix said that 
Paul Charpit submitted a memoir in 1784 on the general solution of 
first-order partial differential equations, but it was never published and 
that Charpit died that year (28 December 1784).³ However, as
Grattan-Guinness (1990, 151) explained, the manuscript is not lost; indeed two
copies survive. Lagrange saw the manuscript in 1793, and the
Lagrange-Charpit method is essentially his more rigorous account of
what Charpit wrote, which Lagrange presented in his Leçons sur le
calcul des fonctions ([177], Lecture 20).⁴


8.5.1 Lacroix’s Traité (1798) 


Monge’s ideas formed the basis of the rather superficial account given 
by Lacroix, the great textbook writer of the period, in his Traité of 1798, 
but this indicates the difficult nature of the subject for even the best 
students of the time. He dealt with what we would call the quasi-linear 
first-order case and some examples of second-order equations. For the 
linear second-order partial differential equation 


\[ a z_{xx} + b z_{xy} + c z_{yy} = V(x, y) \]


with constant coefficients a, b, c he factorised the equation for the
characteristics and proceeded formally to a solution without caring
whether the roots of the resulting quadratic equation were real or
complex, but only that they were distinct.
He noted that many equations escaped this analysis, notably the 
equation 


but that in this case the equation could be solved by an infinite series of 
sums of terms of the form 


2 
WAX. 
eens, 


Lacroix then commented on some remarks about the generality of the 
solution, noting that Laplace had been of the opinion that the equation 
could not be solved by an arbitrary function, but the Italian 
mathematician Pietro Paoli had disagreed. Evidently, Lacroix had little 
idea of the power inherent in series of that kind. Such was the situation 
when Fourier took up the problem, as we shall see in Chap. 10. 


8.6 Exercises 


1. Explain why a line in space that is parallel to the (x, y)-plane has an
equation of the form y = m(z)x + z (ignoring lines parallel to the
x-axis). Eliminate m(z) by differentiating twice, and deduce that the
equation of the surface generated by a line moving in space while
remaining parallel to the (x, y)-plane is

\[ q^2 r - 2pqs + p^2 t = 0. \]

2. What is the Monge cone for a quasi-linear first-order partial
differential equation?

3. Follow through Monge’s analysis for the first-order partial
differential equations considered by Euler.

4. Follow through Monge’s analysis for the wave equation and
Laplace’s equation.


Questions 


1. What does the near silence about the heat equation, even as a
partial differential equation without any physical interpretation,
say about the solution methods available around 1800?

2. To what extent does Monge’s geometrical analysis help you? How
do you find it compares with the more formal account in Sect. 8.2.2
above?

3. How fair is it to say that Lagrange’s account of first- and second-
order partial differential equations in two independent variables is
Monge’s without the geometry? What advantages and
disadvantages are there in the two approaches?


Footnotes 
1 Recall that $r = \partial^2 z/\partial x^2$, $s = \partial^2 z/\partial x\,\partial y$, and $t = \partial^2 z/\partial y^2$.


2 Lagrange also considered linear equations in more than two independent variables in this 
paper. This was the territory that no one else had mastered, and he indicated that his approach 
generalised, but it cannot be treated here. 


3 On Paul Charpit de Ville Coer and his manuscript, see Grattan-Guinness and Engelsman 
[122]. It seems that Charpit came from Strasbourg to Paris in 1782, and became an assistant to 
Monge, who taught him solid geometry. He read an extract of his paper to the Académie des 
Sciences on 30 June 1784, but nothing was done with it. Laplace acquired a copy, and he passed 
it on to Lagrange (who had been in Berlin in 1784) in 1793. Lagrange in due course sent it to 
Charpit’s friend Arbogast, who made a copy that is now in Florence. In 1798 Lacroix described
Charpit’s paper in his Traité du calcul différentiel et du calcul intégral ([168], 496-497,
513-516). Curiously, his copy of the manuscript is shorter than Arbogast’s and seems less reliable.


4 Lagrange’s Leçons are in vol. 10 of his Oeuvres. Charpit’s name is not mentioned.
9.1 Revision and Assessment 1


This chapter is given over to revision and discussion of the first 
assignment, see H.2 in Appendix H. 


9.1.1 Comments 


The Assessment asked students to reflect on what they had studied 
either by imagining becoming a student of mechanics and dynamics 
around 1770, or more generally, on what is involved around 1770 in the 
study of partial differential equations. 

They were to do so by writing a letter (some years as an English 
professor writing to a student, some years as the student writing back 
to their former professor). I asked for a letter, not a history, to help 
them get into the way of seeing things through the protagonists’ eyes. 
No one can see the future, and historians shouldn't try. And in fact, 
generally speaking, answers that began as a letter made it easier for the 
writer to engage with the developments historically. 

No comparison between the mechanics of Newton and Euler should 
diminish Newton’s remarkable achievements in his Principia 
Mathematica. His major work is a theory of celestial mechanics that 
delivered a remarkably accurate theory of the motion of the planets and 


their satellites, in which to a high level of detail, only the motion of the 
Moon around the Earth remained unaccounted for. 

Euler managed no comparable single achievement in this field, but 
he produced theories of motion for rigid bodies of any kind (Newton 
could only handle spheres, which he showed could be treated as 
points), and of fluids and gases. There is also his account of the 
vibrating string. 

Whereas Newton’s method was largely geometrical but 
intermittently invoked calculus-type series of arguments, starting from 
a set of three laws of motion, Euler always started with infinitesimal 
pieces of bodies and worked towards equations of motion expressed in 
calculus terms as differential equations. He reformulated Newton's laws 
of motion as equations of motion, and his calculus of variations was a 
completely novel approach to geometrical and mechanical questions, 
one rapidly improved by Lagrange. The central importance of 
differential equations, and the means of solving them, unite all of 
Euler’s work in this area. 

Perhaps nothing compares with the invention of (single-variable) 
calculus and the creation of celestial mechanics, but if anything does it 
might be Euler’s breadth of applications coupled with the generality of 
his (several-variable) calculus. But Newton’s ingenious methods were 
not what anyone could follow, and Euler’s were.¹

The origins of partial differential equations lie in the wave
equation and its investigation by d'Alembert, Euler, Daniel Bernoulli,
and Lagrange. This led to the idea that its solutions are any functions of
the form $f(x + ct) + g(x - ct)$, but that led to deep questions about
what is an arbitrary function.

First-order linear partial differential equations were first studied by 
treating them like ordinary differential equations, so a good answer 
would go into a little mathematical detail about them because later 
methods were different. Note, for example, the advance in the theory of 
characteristics made by d'Alembert. 

The theory of second-order linear partial differential equations was 
much less developed. Success with the wave equation came with Euler’s 
worry about (what came to be called) the Laplace equation. 


There are other aspects worth mentioning: fluids, the propagation 
of sound, and, perhaps, the calculus of variations. 


Footnotes 


1 A good source is an essay by Maronne and Panza entitled Euler: Reader of Newton. See 
https://hal.archives-ouvertes.fr/hal-00415933. 
10.1 Introduction


Fourier series are infinite series of sines or cosines that are used to 
represent a function. They had been used by Euler and others in the 
eighteenth century in the study of the vibrating string and in celestial 
mechanics, but Fourier’s name is rightly attached to them because of
the great generality and utility he envisaged for them and the success
with which he applied them to the study of heat diffusion. His strong claims for them
were also to prove a valuable challenge to mathematicians who came 
after him and demanded more rigour in analysis. 


10.2 Fourier and His Series 


Joseph Fourier (Fig. 10.1) was born in France in 1768, and for much of 
his life he was caught up in the political transformation of France. When 
he was an assistant lecturer at the Ecole Polytechnique in 1795 he came 
to the attention of Gaspard Monge, who became a prominent supporter 
of Napoleon, and who selected Fourier for the French expedition to 
Egypt in 1798. When that ended in defeat Fourier returned to France in
1801, but Napoleon, impressed by his organisational talents, made him
the prefect, or governor, of the Department of Isère and, still impressed,
a Baron in 1808.

The defeat of Napoleon led to a difficult period in Fourier’s life, as 
the new regime came down hard on those it could accuse of having 
supported either the revolution or Napoleon, but he recovered and with 
the support of Laplace he became the permanent secretary of the
Académie des Sciences in 1822 and was elected to the Académie
Française in 1827. In 1830, he died from complications of an illness
caught in Egypt. 


Fig. 10.1 Joseph Fourier (1768-1830) by Amédée Félix Barthélemy Geille, after Jules Boilly, c. 
1823. Portraits et Histoire des Hommes Utiles, Collection de Cinquante Portraits, Société Montyon 
et Franklin, 1839-1840 


His book, Théorie analytique de la chaleur (The analytical theory of
heat) came out in 1822.¹ It is devoted to the topic of heat diffusion.² By
this time, it was reasonably well known that heat was a state of the hot
material, not a fluid substance permeating the material in an amount
proportional to its temperature. Not much else was understood,
however, and accordingly, Fourier made as few assumptions as possible
about the nature of heat. Rather, he concentrated on formulating the
way heat passes from one part of the body to an adjacent part in a very
short interval of time. He argued that it was enough to suppose that the
amount of heat that passes is proportional to the duration of the time
interval, the infinitesimal temperature difference between adjacent
parts, and a certain function of the distance between the parts. He was
able to examine homogeneous bodies of simple shapes with simple
temperature distributions on their boundaries, say when the
boundaries are kept at fixed temperatures. He also tested his results
experimentally by heating simple shapes and measuring the
temperature at various points and various times and found good
agreement with his theoretical predictions.³

One of his examples was a window—of an unusual shape, being 
infinite from side to side and top to bottom, but of finite thickness— 
kept at a fixed temperature on each side and warmer inside than out. In 
this case, the temperature drops linearly with the distance from the 
warmer side. Another of his examples was that of an oven. 

The problems that Fourier considered have two aspects. One 
concerns the flow of heat in the body, and he showed that that is 
described by a differential equation. The other concerned the 
temperature on the boundaries of the body, and although these can be 
arbitrary they are easiest to handle if the boundary is made of simple 
shapes. Only if the boundary conditions, as these specifications are 
called, are simple can explicit solutions be found. 

We now turn to one of his examples to see how he formulated the 
mathematical equation that describes the flow of heat. 

Fourier considered semi-infinite bars with either semi-circular or
square cross sections, in which one end is kept hot and the rest
uniformly cool. The problem is to find how hot the bar becomes when it
reaches a steady state. He argued that at each point in the interior of
the bar, heat (measured by the temperature, v, a function of x, y, z)
passes through in each of the x-, y-, and z-directions. In §98, he
considered an infinitesimal cube in the interior of the bar and stated
that the amount that enters the face with sides dx and dy is
$-K\,dx\,dy\,\partial v/\partial z$ evaluated at that face, where K is a quantity
determined by the nature of the body, and what leaves the opposite face
is $-K\,dx\,dy\,\partial v/\partial z$ evaluated at that face.⁴ The minus sign arises
because heat flows from a hot body to a cold one.


What left at the second face was found by replacing z by z + dz, and
so the difference between what enters and what leaves is the difference
in the values of $-K\,dx\,dy\,\partial v/\partial z$ at the two faces, which works out to be
$K\,dx\,dy\,dz\,\partial^2 v/\partial z^2$. Because the temperature is in a steady state, the sum of
these quantities taken over the three pairs of opposite faces of a cube is
zero, and the resulting equation is


\[ \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \frac{\partial^2 v}{\partial z^2} = 0. \tag{10.1} \]


The distribution of heat in the body is described by the solution of this
differential equation that satisfies the stated boundary conditions.

A similar argument allowed Fourier to derive the equation for the
distribution of heat in a body that is not in a steady state, so the
temperature v = v(x, y, z, t) is now a function both of position and time.


He now argued (§142) that, by regarding the body as made up of little
cubes, the differential equation (10.2) will hold for the flow of heat in
the body. As before, the amount of heat leaving a cube in the z-direction
is $-K\,dx\,dy\,\partial v/\partial z$, but now the sum over the pairs of opposite faces is
proportional to the rate of change of temperature, which is $\partial v/\partial t$. The
result (§128) is the heat equation:

\[ \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \frac{\partial^2 v}{\partial z^2} = \frac{\partial v}{\partial t}. \tag{10.2} \]


As before, solutions of this partial differential equation are required
that satisfy some given boundary conditions (and, if v is a function of t,
initial conditions). These conditions, however, generally forced him to
suppose that the body has a simple shape, such as a cuboid, or one of a
limited range of other shapes that can be handled by finding suitable
coordinate transformations.


To show how this could be done, Fourier had first to show how to
find a general solution to the partial differential equation, and then
show how to fit the general solution to the boundary conditions. He
began (§166) with the simplest two-dimensional case, a semi-infinite
strip of a given width, $\pi$ in suitable units, located between $(-\pi/2, 0)$ and
$(\pi/2, 0)$, included between two parallel infinite sides at a given
temperature 0 in some units, which at its base has a temperature of 1.
He chose coordinates (here relabelled to be more familiar) in which y
measures the height above the base and x measures the distance of a
point from the mid-line of the strip. The differential equation of the
steady-state distribution, which now involves only two variables, is


\[ \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0. \tag{10.3} \]


It is likely he was inspired by existing treatments of the wave equation.
At all events, he looked for a solution of the form


\[ v(x, y) = f(x)g(y). \]

This forces

\[ \frac{g''(y)}{g(y)} = -\frac{f''(x)}{f(x)}, \tag{10.4} \]

so both sides must be constant, say $m^2$, and the solutions are of the
form

\[ f(x) = \cos mx, \quad g(y) = e^{-my}. \tag{10.5} \]

The boundary conditions force m to be positive, otherwise $e^{-my}$ would
become infinitely great, and if the solution is to vanish for $x = \pm\pi/2$ for
all y, then m must be an odd integer.
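
The separation step can be spelled out as follows. This is a short worked version of the argument just given, under the assumption that the steady-state equation is the two-variable Laplace equation:

```latex
% Substitute v = f(x) g(y) into v_xx + v_yy = 0 and divide by f g:
\[
  f''(x)\,g(y) + f(x)\,g''(y) = 0
  \quad\Longrightarrow\quad
  \frac{g''(y)}{g(y)} = -\frac{f''(x)}{f(x)} .
\]
% The left side depends only on y and the right side only on x,
% so both equal a constant; writing it as m^2 gives
\[
  f'' = -m^2 f, \qquad g'' = m^2 g ,
\]
% with f = \cos mx and g = e^{-my} as the even, bounded solutions.
% The side condition v(\pm\pi/2, y) = 0 requires
\[
  \cos\!\left(\tfrac{m\pi}{2}\right) = 0
  \quad\Longleftrightarrow\quad m = 1, 3, 5, \dots
\]
```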
This led Fourier to contemplate solutions of the form, as he wrote
(§169),

\[ a e^{-y}\cos x + b e^{-3y}\cos 3x + c e^{-5y}\cos 5x + d e^{-7y}\cos 7x + \text{etc.} \tag{10.6} \]


subject to the boundary condition at the base that


\[ 1 = a\cos x + b\cos 3x + c\cos 5x + d\cos 7x + \text{etc.} \tag{10.7} \]


The infinitely many arbitrary constants are now to be determined from 
the boundary conditions. Fourier gave two methods. The first is an 
impressive tour de force, but the second one is much easier and has 
been used ever since, although with much more attention to the 
conditions under which it is valid. 

Fourier argued as follows. We have $f(x) = 1$, $-\pi/2 < x < \pi/2$. The
corresponding Fourier series is $\sum_{n=1}^{\infty} a_n \cos nx$, so


\[ 1 = \sum_{n=1}^{\infty} a_n \cos nx, \]


so, multiplying both sides by cos jx and integrating,


\[ \int_{-\pi/2}^{\pi/2} \cos jx\,dx = \int_{-\pi/2}^{\pi/2} \left( \sum_{n=1}^{\infty} a_n \cos nx \cos jx \right) dx. \]


The left-hand side is


\[ \left[ \frac{1}{j}\sin jx \right]_{-\pi/2}^{\pi/2} = \frac{2}{j}\sin(j\pi/2). \]


The quantity $\sin(j\pi/2)$ is zero when j is even, it is +1 when j is of the
form 4k + 1, and it is −1 when j is of the form 4k + 3. So the left-hand
side is zero if j is even, it is $2/j$ if j is of the form 4k + 1, and it is $-2/j$ if j is
of the form 4k + 3.
The right-hand side is


\[ a_j \frac{\pi}{2}, \]


so $a_j$ is zero if j is even, it is $4/(j\pi)$ if j is of the form 4k + 1, and it is $-4/(j\pi)$
if j is of the form 4k + 3.


This gives the result


\[ 1 = \frac{4}{\pi}\left( \cos x - \frac{1}{3}\cos 3x + \frac{1}{5}\cos 5x - \cdots \right), \]


as Fourier said.
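
The expansion of the constant 1 can be explored numerically. The sketch below (function name and sample points are illustrative assumptions) sums the first terms of (4/π)(cos x − cos 3x/3 + cos 5x/5 − ...) and evaluates the result inside the strip, where it should be close to 1, and just outside it, where the same series is close to −1.

```python
import math


def partial_sum(x, terms):
    """First `terms` terms of (4/pi)(cos x - cos 3x / 3 + cos 5x / 5 - ...)."""
    total = 0.0
    for k in range(terms):
        n = 2 * k + 1  # only odd multiples of x occur
        total += (-1) ** k * math.cos(n * x) / n
    return 4.0 / math.pi * total


# Points with |x| < pi/2 lie inside the strip; x = 2.5 lies outside it.
for x in (0.0, 0.5, 1.0, 2.5):
    print(x, partial_sum(x, 2000))
```

Convergence is slow (the error shrinks roughly like the reciprocal of the number of terms, and more slowly near x = ±π/2), which is one reason the question of convergence later needed careful treatment.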
It gave him that, for all values of x between $-\pi/2$ and $\pi/2$,


\[ v = \frac{4}{\pi}\left( e^{-y}\cos x - \frac{1}{3}e^{-3y}\cos 3x + \frac{1}{5}e^{-5y}\cos 5x - \frac{1}{7}e^{-7y}\cos 7x + \text{etc.} \right), \tag{10.8} \]

where on the boundary⁵

\[ 1 = \frac{4}{\pi}\left( \cos x - \frac{1}{3}\cos 3x + \frac{1}{5}\cos 5x - \cdots \right). \tag{10.9} \]


Here is a graph of the sum of the first 105 terms of the series for the
function $F(x) = \pm\pi/4$ (Fig. 10.2):


\[ \cos(x) - \frac{1}{3}\cos(3x) + \frac{1}{5}\cos(5x) - \frac{1}{7}\cos(7x) + \cdots. \]


Fig. 10.2 The sum of the first 105 terms of the Fourier series for $F(x) = \pm\pi/4$


Note how small the difference between the sum of the series and
the sum of its first 105 terms has become.

To reach these conclusions, he had observed in §220 that when
$j \neq k$ (the switch from cosine to sine is irrelevant here)


\[ \int_0^{\pi} \sin jx \sin kx\,dx = \frac{1}{2}\left[ \frac{1}{k-j}\sin(k-j)x - \frac{1}{k+j}\sin(k+j)x \right]_0^{\pi} = 0, \tag{10.10} \]


and that the integral is $\pi/2$ when $j = k$. He accordingly deduced that
the coefficients of the series can be found by multiplying the series by
sin jx, for each value of j, and integrating. For, if


\[ f(x) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} b_n \sin nx, \]


then multiplying both sides by sin jx and integrating over $(-\pi, \pi)$,
where the same orthogonality holds, gives


\[ \int_{-\pi}^{\pi} f(x)\sin jx\,dx = \int_{-\pi}^{\pi} \frac{1}{2}a_0 \sin jx\,dx + \int_{-\pi}^{\pi} \sum_{n=1}^{\infty} b_n \sin nx \sin jx\,dx. \]


This, he simply assumed, is equal to


\[ \int_{-\pi}^{\pi} \frac{1}{2}a_0 \sin jx\,dx + \sum_{n=1}^{\infty} \int_{-\pi}^{\pi} b_n \sin nx \sin jx\,dx. \]


In this expression, the first integral is zero and all the terms
$\int_{-\pi}^{\pi} b_n \sin nx \sin jx\,dx$ vanish except for
the one in which n = j, and this one is equal to $\pi b_j$. So


\[ \int_{-\pi}^{\pi} f(x)\sin jx\,dx = \pi b_j \]

and therefore,

\[ b_j = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin jx\,dx. \]

Similar results apply to series of cosines, to series of sines and cosines,
and to series obtained when the period is different (such as $\pi$ or 1). For
example, changing the range of integration from $(-\pi/2, \pi/2)$ to
$(-\pi, \pi)$, which is much more convenient for later use, when $j \neq k$


\[ \int_{-\pi}^{\pi} \sin jx \sin kx\,dx = \frac{1}{2}\left[ \frac{1}{k-j}\sin(k-j)x - \frac{1}{k+j}\sin(k+j)x \right]_{-\pi}^{\pi} = 0, \tag{10.11} \]


and the integral is $\pi$ when $j = k$.


He went on to claim that any function f defined on the interval
$(-\pi, \pi)$ can be written as an infinite series of sines and cosines in any of
these forms, depending on how they are to be continued outside the
interval on which they have been defined:


\[ f(x) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) \quad \text{(mixed series)} \]

\[ f(x) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos nx \quad \text{(cosine series, which works only for even functions)} \]

\[ f(x) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} b_n \sin nx \quad \text{(sine series, which works only for odd functions)} \]


The coefficients of the mixed series are given by the formulae
$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx$ and


\[ a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos kx\,dx, \quad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin kx\,dx. \tag{10.12} \]
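
The coefficient formulae can be tried out numerically. The following sketch approximates the integrals in (10.12) with a composite Simpson rule and applies them to the illustrative function f(x) = x on (−π, π); the function names and the choice of f are assumptions made for this example. For this f the cosine coefficients vanish and the sine coefficients are known in closed form to be 2(−1)^(k+1)/k.

```python
import math


def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def cosine_coefficient(f, k):
    """a_k = (1/pi) * integral of f(x) cos(kx) over (-pi, pi)."""
    return simpson(lambda x: f(x) * math.cos(k * x), -math.pi, math.pi) / math.pi


def sine_coefficient(f, k):
    """b_k = (1/pi) * integral of f(x) sin(kx) over (-pi, pi)."""
    return simpson(lambda x: f(x) * math.sin(k * x), -math.pi, math.pi) / math.pi


def f(x):
    return x  # illustrative odd function


for k in (1, 2, 3):
    print(k, cosine_coefficient(f, k), sine_coefficient(f, k), 2 * (-1) ** (k + 1) / k)
```

The numerical integration only computes the coefficients; it says nothing about whether the resulting series converges to f, which is the question taken up by Fourier's critics.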


The result is that the heat equation can be solved, for bodies with 
simple shapes, by the method of Fourier series, and the answer is 
written as an infinite series of either sines, cosines, or both. The 
solution of the partial differential equation appears as an infinite series 
with largely arbitrary coefficients. The boundary conditions allow the 
coefficients to be determined, and a unique solution is exhibited. 

Or rather, we should say, Fourier claimed that this could be done. 
Very quickly his claim became a challenge to mathematicians to prove 
that it is correct—and this was to become a long and fascinating story 
that we can only begin to tell here. 

Fourier’s ideas generated quite some discussion in print. Siméon
Denis Poisson rightly complained that Fourier’s method for finding the
coefficients in a Fourier series⁶ “has not in fact been demonstrated in a
precise and rigorous manner”. A decade later he objected (again
correctly) that the fundamental assumption that an arbitrary function
can be expanded as an infinite series of sines and cosines had not been
proved, and Charles-Francois Sturm joined in, remarking that
“Fourier and other geometers seem to have misunderstood the
importance and the difficulty of this problem, which they have confused
with that of determining the coefficients”.⁷


10.2.1 Dirichlet on the Convergence of Fourier Series 


Peter Gustav Lejeune Dirichlet, who had got to know Fourier personally 
during a long stay in Paris, took up the subject of the convergence of 
Fourier series in the late 1820s, by which time it was a topic of some 
discussion, although Dirichlet said that he knew of no other attempt on 
the problem than Cauchy’s, which he found to be flawed.⁸ In his opinion
it was remarkable that a Fourier series expansion of an arbitrary 
function converges (which suggests that he doubted neither the 
existence nor the convergence of the series), and so he proposed to 
establish the convergence of a Fourier series directly, and to show that 
the series and the function agree. He succeeded, by a very fine
application of Cauchy’s own ε–δ analysis and the theory of
convergence, in showing that the Fourier series representation of a
function that is piecewise continuous and piecewise monotonic on an
interval converges and agrees with the function except at the points
where the function jumps. At points where


\[ \lim_{x \to a^-} f(x) = \alpha \neq \beta = \lim_{x \to a^+} f(x) \]

the Fourier series takes the value $\tfrac{1}{2}(\alpha + \beta)$.
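
Dirichlet's statement about jumps can be illustrated with a standard example that is not taken from his paper: the function f(x) = x on (−π, π), continued periodically, whose sine series is 2(sin x − sin 2x/2 + sin 3x/3 − ...). At x = π the periodic continuation jumps from π to −π, so the series should give the mean value 0 there, while at interior points it should approach x. The function name and sample points in this sketch are assumptions.

```python
import math


def sawtooth_partial_sum(x, terms):
    """First `terms` terms of 2 * sum_{n>=1} (-1)^(n+1) sin(n x) / n."""
    total = 0.0
    for n in range(1, terms + 1):
        total += (-1) ** (n + 1) * math.sin(n * x) / n
    return 2.0 * total


# Interior points: the partial sums approach f(x) = x.
print(sawtooth_partial_sum(1.0, 5000), sawtooth_partial_sum(-2.0, 5000))
# At the jump x = pi: the partial sums give the mean of pi and -pi, namely 0.
print(sawtooth_partial_sum(math.pi, 5000))
```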


Dirichlet’s rigorous argument made it clear that to prove Fourier’s 
claim in any greater degree of generality would be hard work, and 
indeed later generations of mathematicians would discover that 
Fourier’s claim is in fact false in general and applies only to functions 
that do not oscillate wildly. 


10.2.2 Fourier Integrals 


Fourier also considered how heat diffused in an infinite bar under two 
distinct conditions. In the first, a part of the bar is raised to a 
temperature given by a function F(x), the rest being at temperature 
zero. In the second, one end of the bar is kept at a constant 
temperature. 

He began (§345) by supposing that the bar is an infinite line, modelled
by the positive real axis {x : 0 < x}, and that the heated region is the
interval [0, 1], the temperature being given by a function F(x) at
t = 0. He then supposed that the whole line is considered, and the
data extended to the negative real axis by defining F(−x) = F(x). The
problem is to be solved by a function v(x, t).

The equation to be solved is

$$\frac{\partial v}{\partial t} = k\frac{\partial^2 v}{\partial x^2} - hv,$$

where h and k are constants determined by the physical properties of
the wire. He set $v = e^{-ht}u$ and reduced the equation to

$$\frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2}.$$
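
The reduction can be checked in one line. The following worked step is a sketch that assumes the equation has the form ∂v/∂t = k ∂²v/∂x² − hv and that the substitution is v = e^{−ht}u:

```latex
\begin{align*}
\frac{\partial v}{\partial t} &= e^{-ht}\left(\frac{\partial u}{\partial t} - hu\right), \qquad
\frac{\partial^2 v}{\partial x^2} = e^{-ht}\frac{\partial^2 u}{\partial x^2}, \\
e^{-ht}\left(\frac{\partial u}{\partial t} - hu\right)
  &= e^{-ht}\left(k\frac{\partial^2 u}{\partial x^2} - hu\right)
  \quad\Longrightarrow\quad
  \frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2}.
\end{align*}
```

The term in h cancels, which is why the loss of heat to the surroundings plays no further part in the analysis.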
The equation is solved, for example, by

$$u = a\cos(qx)\,e^{-kq^2 t},$$

where a and q are arbitrary constants and therefore, Fourier supposed,
by an infinite sum of such expressions:

$$u = \sum_j a_j \cos(q_j x)\,e^{-k q_j^2 t}.$$


By supposing the successive qs vary only a little, this led him to look for
a solution of the form

$$u(x,t) = \int_0^\infty f(q)\cos(qx)\,e^{-kq^2 t}\,dq,$$

where f(q) is an as-yet unknown function of q that can be found from
the initial data, because

$$u(x, 0) = F(x).$$

This led him to this equation for f(q):

$$F(x) = \int_0^\infty f(q)\cos(qx)\,dq,$$

which he called (§346) “a remarkable problem whose solution
demands attentive examination”.⁹

He solved this equation by going back to thinking of the solution as
an infinite series and using his old method for integrating in a way that
picked up a single term of the series. This led him to the solution

$$f(q) = \frac{2}{\pi}\int_0^\infty F(x)\cos qx\,dx.$$

Therefore, the solution to the heat diffusion problem in this case is

$$u(x,t) = \frac{2}{\pi}\int_0^\infty\left(\int_0^\infty F(\alpha)\cos q\alpha\,d\alpha\right)\cos(qx)\,e^{-kq^2 t}\,dq.$$


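Fourier then (§348) considered the special case in which F(x) = 1 when
−1 < x < 1 and F(x) = 0 otherwise. The x-integral becomes

$$\frac{2}{\pi}\int_0^1 \cos qx\,dx = \frac{2}{\pi}\,\frac{\sin q}{q},
\quad\text{so that}\quad
F(x) = \frac{2}{\pi}\int_0^\infty \frac{\sin q\,\cos qx}{q}\,dq.$$
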

Here, Fourier remarked that the discontinuous function F(x) has been 
expressed by a definite integral. 
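
Fourier's remark can be tested numerically. The sketch below assumes the special case is the step function equal to 1 for |x| < 1 and 0 otherwise, so that the integral in question is (2/π)∫₀^∞ sin q cos(qx)/q dq; the function name, the truncation point and the step count are illustrative choices, not anything in Fourier's text.

```python
import math

def fourier_step_integral(x: float, upper: float = 400.0, steps: int = 400_000) -> float:
    """Midpoint-rule estimate of (2/pi) * int_0^upper sin(q) cos(q x) / q dq."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        q = (i + 0.5) * h  # midpoints avoid the removable singularity at q = 0
        total += math.sin(q) * math.cos(q * x) / q
    return (2 / math.pi) * total * h

inside = fourier_step_integral(0.5)   # a point with |x| < 1
outside = fourier_step_integral(2.0)  # a point with |x| > 1
edge = fourier_step_integral(1.0)     # the point of discontinuity
print(inside, outside, edge)
```

Because the integral converges only conditionally, the truncated estimates agree with 1, 0 and ½ to about two or three decimal places rather than to machine precision.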

To deal with a semi-infinite bar heated at one end, Fourier found it
convenient (§351) to think of an infinite bar heated in the middle by a
function F(x) for which F(−x) = −F(x). In this case, he found the
solution to be

$$u(x,t) = \frac{2}{\pi}\int_0^\infty\left(\int_0^\infty F(\alpha)\sin(q\alpha)\,d\alpha\right) e^{-kq^2 t}\sin(qx)\,dq.$$


At the risk of being somewhat arbitrary, I add this remark by Fourier 
(§358): 


We might deduce also from the transformation of series into
integrals the properties of the two expressions

$$\frac{2}{\pi}\int_0^\infty \frac{\cos qx\,dq}{1+q^2}
\quad\text{and}\quad
\frac{2}{\pi}\int_0^\infty \frac{q\sin qx\,dq}{1+q^2};$$

the first (Art. 350) is equivalent to $e^{-x}$ when x is positive, and to
$e^{x}$ when x is negative. The second is equivalent to $e^{-x}$ when x is
positive, and to $-e^{x}$ when x is negative, so the two integrals have
the same value, when x is positive, and have values of contrary
sign when x is negative.


10.3 The Analysis of Fourier Integrals 


Fourier was as over-confident here as he had been when dealing with 
infinite series—or perhaps we should say that rising standards of 
rigour were to catch up with him. Rather than trace the history of 
analyses of this aspect of his work, I shall leap to a satisfactory solution 
from just over a century later by the American mathematician A.G. 
Webster that indicates what had to be done.¹⁰

We begin with the Fourier series for a function defined on the
interval [−l, l]:

$$f(x) = \tfrac12 a_0 + \sum_{j=1}^{\infty}\left(a_j\cos\frac{j\pi x}{l} + b_j\sin\frac{j\pi x}{l}\right),$$

where

$$a_j = \frac{1}{l}\int_{-l}^{l} f(s)\cos\frac{j\pi s}{l}\,ds, \qquad
b_j = \frac{1}{l}\int_{-l}^{l} f(s)\sin\frac{j\pi s}{l}\,ds,$$

and we suppose that the function f is such that these last two integrals
converge. The question is: what happens as l → ∞?

Note first that if f(s) is absolutely integrable on the whole real line
then $a_j$ and $b_j$ both tend to zero as l → ∞.


We define

$$S(l, m, x) = \frac{1}{l}\int_{-l}^{l} f(s)\left(\frac12 + \sum_{j=1}^{m}\cos\frac{j\pi(s-x)}{l}\right)ds$$

and consider its value as both l and m tend to infinity. This value
depends on the order in which we take these limits.

Now, a detailed argument (here omitted, see Webster [265], 154)
leads to the conclusion that if the function f is continuous at x then

$$f(x) = \lim_{p\to\infty}\frac{1}{\pi}\int_{-\infty}^{\infty}\left(\int_0^{p}\cos(t(s-x))\,dt\right) f(s)\,ds,$$


where f = x + iz. So we require that 0 < s < 1, and so we let m — oo 


first. 
It then seems natural to suppose that

$$\lim_{p\to\infty}\frac{1}{\pi}\int_{-\infty}^{\infty}\left(\int_0^{p}\cos(t(s-x))\,dt\right)f(s)\,ds
= \frac{1}{\pi}\int_{-\infty}^{\infty}\left(\int_0^{\infty}\cos(t(s-x))\,dt\right)f(s)\,ds.$$

Indeed, this is the form in which Fourier gave it. But this, to quote
Webster (p. 155) “makes no sense” because the integral

$$\int_0^{N}\cos(t(s-x))\,dt$$

does not tend to a limit as N → ∞ but instead oscillates.


Instead, as Webster pointed out (see pp. 155-156), it is possible to 
switch the order of integration—I omit the argument—and to deduce 
that the correct value of the Fourier integral is 


$$\frac{1}{\pi}\int_0^{\infty}\left(\int_{-\infty}^{\infty} f(s)\cos(t(s-x))\,ds\right)dt. \qquad (10.13)$$
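
A numerical spot check of the double integral with the inner integration over s done first is easy for one test function. The sketch below assumes f(s) = e^{−|s|}, for which the inner integral has the closed form 2cos(tx)/(1 + t²); the choice of f, the function name and the truncation point are all illustrative assumptions. The outer integral should then return e^{−|x|}, which also matches the first of the two expressions in Fourier's remark quoted earlier.

```python
import math

def fourier_integral_of_exp_abs(x: float, upper: float = 2000.0, steps: int = 400_000) -> float:
    """Estimate (1/pi) * int_0^upper [2 cos(t x) / (1 + t^2)] dt by the midpoint rule.

    The bracket is the closed form of the inner integral
    int_{-inf}^{inf} exp(-|s|) cos(t (s - x)) ds.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += 2.0 * math.cos(t * x) / (1.0 + t * t)
    return total * h / math.pi

recovered = fourier_integral_of_exp_abs(1.0)
print(recovered, math.exp(-1.0))
```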


Webster at this point quoted Kronecker (Vorlesungen über die Theorie
der einfachen und vielfachen Integrale, 81) to indicate the continuing 
importance of Fourier’s result: 


This so-called Fourier double-integral made at its discovery a 
tremendous impression on the mathematical world. It was 
shown for the first time how an almost arbitrary function, 
satisfying only the limitations mentioned, fits itself into 
mathematical forms. The formula (10.13) maintains its 
correctness, as was shown by P. du Bois-Reymond, for various 
fluctuating functions, inserted instead of the cosine. 


10.4 Stokes and Laplace Transform 


In the 1860s, as we shall see more fully in Chap. 20, Thomson and 
Stokes used the heat equation to successfully describe the transmission 
of electricity down a wire. In the course of that work, Stokes used a 
Fourier integral approach which we shall now examine.¹¹

Stokes treated the problem as being one of heat diffusion down a
semi-infinite wire (x ≥ 0) with an arbitrary initial distribution of heat
(or electricity) along it. The case of heat (or electricity)
concentrated at the end point x = 0 is of particular interest because it
corresponds to an impulse at that end.

He wrote the equation for the temperature v(x, t) as

$$\frac{\partial v}{\partial t} = \frac{\partial^2 v}{\partial x^2},$$
with the conditions that 
$$v(0, t) = f(t), \qquad v(x, 0) = 0.$$


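He then looked for a solution in the form

$$v(x,t) = \int_0^\infty u(\alpha,t)\sin\alpha x\,d\alpha. \qquad (10.14)$$

First, differentiation with respect to t under the integral sign gives

$$\frac{\partial v}{\partial t} = \int_0^\infty \frac{\partial u}{\partial t}\sin\alpha x\,d\alpha.$$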


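But, he observed, differentiation under the integral sign with respect to
x does not work, because v does not vanish when x = 0, and it is
necessary to add the term

$$\frac{2}{\pi}\alpha v(0,t) = \frac{2}{\pi}\alpha f(t),$$

as he had explained in a paper he had published earlier.

It follows that

$$\frac{\partial^2 v}{\partial x^2} = \int_0^\infty\left(\frac{2}{\pi}\alpha f(t) - \alpha^2 u\right)\sin\alpha x\,d\alpha.$$

Hence,

$$\frac{\partial v}{\partial t} - \frac{\partial^2 v}{\partial x^2}
= \int_0^\infty\left(\frac{\partial u}{\partial t} - \frac{2}{\pi}\alpha f(t) + \alpha^2 u\right)\sin\alpha x\,d\alpha,$$

and so

$$\frac{\partial u}{\partial t} = \frac{2}{\pi}\alpha f(t) - \alpha^2 u.$$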


This is an ordinary differential equation whose solution was known. It
is

$$u(\alpha,t) = \frac{2}{\pi}\,\alpha e^{-\alpha^2 t}\int^{t} f(t')e^{\alpha^2 t'}\,dt',$$

where the constant of integration has been expressed as an arbitrary
lower end point of the integral.
The initial condition v(x, 0) = 0 implies that

$$u(\alpha,t) = \frac{2}{\pi}\,\alpha\int_0^t f(t')e^{-\alpha^2(t-t')}\,dt'.$$

So Stokes wrote that the temperature v is given by

$$v = \frac{2}{\pi}\int_0^t\!\int_0^\infty f(t')\,\alpha e^{-\alpha^2(t-t')}\sin\alpha x\,d\alpha\,dt'.$$

Equation 10.14 implies that this means

$$v = \int_0^\infty\left[\frac{2}{\pi}\int_0^t \alpha f(t')e^{-\alpha^2(t-t')}\,dt'\right]\sin\alpha x\,d\alpha.$$


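He now switched the order of integration, so that the first integral to be
evaluated is the α integral, which is of the form

$$\int_0^\infty e^{-a\alpha^2}\,\alpha\sin b\alpha\,d\alpha.$$

This is a derivative of a known integral:

$$\int_0^\infty e^{-a\alpha^2}\cos b\alpha\,d\alpha = \frac12\left(\frac{\pi}{a}\right)^{1/2}e^{-b^2/4a},$$

so

$$\int_0^\infty e^{-a\alpha^2}\,\alpha\sin b\alpha\,d\alpha
= -\frac{d}{db}\,\frac12\left(\frac{\pi}{a}\right)^{1/2}e^{-b^2/4a}
= \frac{b\,\pi^{1/2}}{4a^{3/2}}\,e^{-b^2/4a}.$$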
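
The value of the α integral can be confirmed numerically. The sketch below assumes the integral has the form ∫₀^∞ e^{−aα²} α sin(bα) dα with closed form b√π e^{−b²/4a}/(4a^{3/2}); the function names and the cut-off are illustrative choices.

```python
import math

def damped_sine_integral(a: float, b: float, steps: int = 200_000) -> float:
    """Midpoint-rule estimate of int_0^inf exp(-a*al^2) * al * sin(b*al) d(al)."""
    upper = 10.0 / math.sqrt(a)  # beyond this the Gaussian factor is below exp(-100)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        al = (i + 0.5) * h
        total += math.exp(-a * al * al) * al * math.sin(b * al)
    return total * h

def closed_form(a: float, b: float) -> float:
    """The value obtained by differentiating the known cosine integral with respect to b."""
    return b * math.sqrt(math.pi) / (4.0 * a ** 1.5) * math.exp(-b * b / (4.0 * a))

print(damped_sine_integral(1.0, 1.0), closed_form(1.0, 1.0))
```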


Whence, as Stokes said, writing t − t′ for a and x for b,

$$v(x,t) = \frac{x}{2\pi^{1/2}}\int_0^t (t-t')^{-3/2}\,e^{-x^2/4(t-t')}f(t')\,dt'.$$

Three comments are in order. First, switching the order of integration is
impermissible, although the conclusion remains correct.
Second, as we shall see in Chap. 20, the x² term is critical.
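
As a sanity check on the final formula, one can take the end of the wire to be held at the constant value f ≡ 1. Assuming the solution reads v(x, t) = x/(2√π) ∫₀ᵗ (t − t′)^{−3/2} e^{−x²/4(t−t′)} f(t′) dt′, the substitution w = x/(2√(t − t′)) turns it into the complementary error function erfc(x/(2√t)). The sketch below compares the two numerically; the function name and step count are illustrative.

```python
import math

def stokes_solution_constant_end(x: float, t: float, steps: int = 200_000) -> float:
    """Midpoint-rule evaluation of the formula for v(x, t) with f identically 1 (x > 0)."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h  # s stands for t - t'
        total += s ** -1.5 * math.exp(-x * x / (4.0 * s))
    return x / (2.0 * math.sqrt(math.pi)) * total * h

numeric = stokes_solution_constant_end(1.0, 1.0)
expected = math.erfc(1.0 / (2.0 * math.sqrt(1.0)))
print(numeric, expected)
```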


Third, the known integral is an example of a Laplace transform, 
which we now proceed briefly to discuss. 

Ingenuity with integrals was a necessary skill of the eighteenth- 
century mathematician, and Euler and others discovered many clever 
results without being encumbered by rigour. One of Laplace’s 
contributions in this line was the study of integrals of the form 


$$\int_0^\infty e^{-st}f(t)\,dt,$$

which are today called the Laplace transform of the function f(t).
When the function f(t) is a power of t the Laplace transform can be 
found recursively, starting with f(t) = t. When f(t) is an exponential it 


is elementary to find the transform, and in this way the Laplace 
transforms of the trigonometric and hyperbolic functions are found. 
More complicated integrals, such as the one Stokes used, require 
ingenuity and were the stock in trade of mathematicians of the day (as 
they still are for many kinds of engineer). 
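
The elementary cases mentioned here can be written out as follows. This is a sketch in modern notation: the transform variable s (with s > 0, and s > a in the exponential case) and the frequency ω are assumed symbols, and the recursion comes from a single integration by parts.

```latex
\begin{align*}
\int_0^\infty e^{-st}\,t^{n}\,dt
  &= \frac{n}{s}\int_0^\infty e^{-st}\,t^{n-1}\,dt
  \quad\Longrightarrow\quad
  \int_0^\infty e^{-st}\,t^{n}\,dt = \frac{n!}{s^{n+1}}, \\
\int_0^\infty e^{-st}\,e^{at}\,dt &= \frac{1}{s-a}, \\
\int_0^\infty e^{-st}\cos\omega t\,dt
  &= \frac12\left(\frac{1}{s-i\omega} + \frac{1}{s+i\omega}\right)
   = \frac{s}{s^{2}+\omega^{2}}, \\
\int_0^\infty e^{-st}\cosh\omega t\,dt
  &= \frac12\left(\frac{1}{s-\omega} + \frac{1}{s+\omega}\right)
   = \frac{s}{s^{2}-\omega^{2}} \quad (s > |\omega|).
\end{align*}
```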

The Laplace integral

$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi},$$

which implies that

$$\int_{-\infty}^{\infty} e^{-ax^2}\,dx = \sqrt{\pi/a},$$

is one of the pleasures of contour integration in elementary complex
function theory. I leave that for you to find, but this anecdote about it is
irresistible.¹²

When Thomson was young he went to Paris where he formed a high
opinion of Joseph Liouville. He asked him about this integral, and
Liouville immediately evaluated it. This mightily impressed Thomson,
and



Once when lecturing he [Thomson] used the word 
“mathematician” and then interrupting himself asked the class: 


“Do you know what a mathematician is?”. Stepping to the 
blackboard he wrote upon it: 


$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$


Then, putting his finger on what he had written, he turned to his 
class and said: “A mathematician is one to whom that is as 
obvious as that twice two makes four is to you. Liouville was a 
mathematician.” Then he resumed his lecture. 


10.5 Exercises 


1.


Find Fourier series expansions of some very simple functions, such 
as = x+ ayand = x + ay ona suitable interval. 


Confirm Liouville’s claim that 


$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$


Fourier’s first method for finding Fourier coefficients invoked
Wallis’s series for π as an infinite product. Find out what this is and
how Fourier used it.


Questions 


1. 


Euler said that the heat equation could only be solved with great 
effort. Fourier tackled it with methods drawn from the theory of 
the vibrating string. What does that say about methods for solving 
partial differential equations around 1800? 


The eighteenth-century debate about arbitrary solutions to the 
wave equation and their representation by infinite series of sines 
and cosines was inconclusive, but the nineteenth-century debate 
was hugely productive. Why do you think this was? 


Footnotes 


1 It was written in several stages and has a complicated publication history that 

Fourier described in its Preliminary Discourse. In 1812, a version of his theory won a prize of 
the Institut de France in Paris Academy, but Lacroix and Laplace were unable to overcome the 
objections of Lagrange and let Fourier’s account be published. Lagrange accepted that the 
correct equation for heat diffusion had been found, but not the generality of the solutions. For a
rich introduction to Fourier and his work, see Grattan-Guinness and Ravetz [123]. 


2 See the translation by A. Freeman in the Internet Archive. 


3 See Grattan-Guinness and Ravetz ([123], 421-440) for Fourier’s account of 1807. 


4 I set aside Fourier’s remarks about what physical properties affect K.


5 Note that we have a series of continuous (indeed, analytic) functions that defines a function 
that is plainly not continuous. 


6 Poisson ([226], 46), quoted in Bottazzini ([21], 188). 


7 See Poisson ([227], 186) and Sturm ([250], 400). 


8 See Dirichlet ([63] and, for a historical account in keeping with the present book, [127]). 


9 We recognise this as the introduction of “Fourier transforms”, whose properties Fourier did 
indeed begin to study. 


10 See Webster ([265], 153-156). 


11 It is well worth consulting https://www.math.ubc.ca/~feldman/m267/pdeft.pdf for an
account of how the Fourier transform can be applied to partial differential equations, including
the wave equation and the telegraphist’s equation. The same method applied to the heat
equation is described in http://web.math.ucsb.edu/~helena/teaching/math124b/heat.pdf. See
also Appendix D.


12 From Thompson [254], 1139 quoted in Lützen [192], 146. The author here is Sylvanus P.
Thompson, in his biography of Thomson. 
11.1 Introduction


The hypergeometric equation is arguably the richest example of a 
linear ordinary differential equation with polynomial functions as 
coefficients. It has deep roots in the study of elliptic integrals, and its 
study throughout the nineteenth century was to be promoted by Gauss, 
Riemann, and others.¹


11.2 Elliptic Integrals 


The simplest, paradigmatic, elliptic integral is

$$\int_0^x \frac{dt}{\sqrt{1-t^4}}. \qquad (11.1)$$

It measures arc length along the lemniscate r² = cos 2θ, which is a
curve in the shape of a figure eight.


The Italian mathematician Count Fagnano had found some 
interesting results about this integral in 1714, and after Fagnano 
submitted his life’s work, the Produzioni [101] to the Berlin Academy 
they were sent to Euler, who greatly extended them in the early 1750s.²
Even so, it was becoming an embarrassment that the integral could not 
be expanded as a function of its upper endpoint, except as a power 
series that revealed no significant properties of the integral. 

Euler’s work on many topics was a great stimulus to Adrien-Marie
Legendre, who took up the challenge of including elliptic integrals in
an extended theory of functions. He concentrated on the integrals

$$F(x) = \int_0^x \frac{dx}{\Delta} \quad\text{and}\quad E(x) = \int_0^x \frac{(1-c^2x^2)\,dx}{\Delta},$$

where $\Delta = \sqrt{(1-x^2)(1-c^2x^2)}$, which he called integrals of the first
and second kinds, respectively. He called the parameter c the modulus
and required it to be real. He was also interested in the corresponding
complete integrals, taken from 0 to 1, which he denoted F¹ and E¹,
respectively, and as F¹(c) and E¹(c) when he wanted to think of them
as functions of the modulus.

Legendre’s three-volume Exercices de calcul intégral [184] draws
together his life’s work on the subject. Among the many results it
contains is one that shows that the complete elliptic integrals
satisfy linear differential equations as functions of the modulus³:

$$(1-c^2)\frac{d^2F^1}{dc^2} + \frac{1-3c^2}{c}\,\frac{dF^1}{dc} - F^1 = 0 \qquad (11.2)$$

$$(1-c^2)\frac{d^2E^1}{dc^2} + \frac{1-c^2}{c}\,\frac{dE^1}{dc} + E^1 = 0 \qquad (11.3)$$


He solved these equations by the method of undetermined coefficients
and so obtained power series expansions for the complete integrals
F¹(c) and E¹(c). He also established a strikingly attractive result
connecting complete integrals of the first two kinds with
complementary moduli (c and $b = \sqrt{1-c^2}$):

$$\frac{\pi}{2} = F^1(c)E^1(b) + F^1(b)E^1(c) - F^1(b)F^1(c). \qquad (11.4)$$


Legendre was keen to show that his new functions would be useful. He 
discussed at length how to calculate tables of values for them, and then
investigated three problems in detail: the rotation of a solid about a 
fixed point; the motion (either in a plane or in space) of a body 
attracted to two fixed bodies; and the gravitational attraction due to an 
homogeneous ellipsoid. In the first volume of his Traité (1828), he 
further investigated motion under central forces, the surface area of 
oblique cones, the surface area of ellipsoids, and the problem of 
determining geodesics on an ellipsoid. But for all this work, these 
integrals did not reveal their most fundamental properties to him, as 
we shall now see. 


Fig. 11.1 Carl Friedrich Gauss (1777-1855) by Christian Albrecht Jensen, 1840


11.3 Gauss 


Gauss (Fig. 11.1) was arguably the first mathematician to leave the
circumscribed eighteenth-century domain of functions given by explicit
expressions and move with ease into the large class of functions
known only indirectly through some prescribed property.

This move confronts all who take it with the question: when is a 
function “known”? One answer is to develop a theory of functions in 
terms of some characteristic traits that can be used to mark certain 
functions out as having particular properties: they are periodic, or they 
have no zeros, for example. A second answer sidesteps the question and 
regards the inter-relation of functions given in power series as itself the 
answer, which is what Gauss did in his study of the hypergeometric 
series. Most mathematicians adopted a mixture of the two approaches 
depending on their own success with a given problem. Later, 
Weierstrass and his followers in the Berlin school based their theory of 


functions on the study of series. On the other hand, Riemann and later 
workers, chiefly Klein and Poincaré, sought more geometric answers. 

Gauss was a brilliantly gifted mathematician born at an unusual 
time. In 1801, the year Gauss became famous at the age of 24, 
Lagrange was 64, Laplace 51, Legendre 48, and Monge 54. Contact with 
them, and the younger generation of Cauchy and Fourier, would have 
been difficult for Gauss because of the Napoleonic war, and perhaps 
distasteful, given his conservative disposition. His teachers, Pfaff (then 
35) and Kaestner (81), were not of the first rank, and nor were his 
contemporaries Bartels and Farkas Bolyai. By the time the next 
generation of young mathematicians emerged (Jacobi and Abel, for 
example) Gauss had become confirmed in a lifelong avoidance of 
mathematicians, and was closer to German astronomers, notably 
Bessel, in whose subject he worked increasingly. 

Gauss left several of his best discoveries unpublished, and they only
became known with the publication of his collected works after his
death in 1855. His sympathy for the work of Janos Bolyai and
Lobachevskii, when it was revealed, helped change attitudes to
non-Euclidean geometry, and his work on elliptic functions and his work on
the hypergeometric equation and the hypergeometric series were also
revelatory. To introduce it, we must make a digression and consider
what is called the arithmetico-geometric mean (agm).

Gauss discovered the agm for himself when he was 15. It is defined
as follows for positive numbers $a_0$ and $b_0$. Set $a_1 = \tfrac12(a_0 + b_0)$, their
arithmetic mean, and $b_1 = \sqrt{a_0 b_0}$, their geometric mean. The iteration
of this process, defining

$$a_{n+1} = \tfrac12(a_n + b_n) \quad\text{and}\quad b_{n+1} = \sqrt{a_n b_n},$$

produces two sequences $(a_n)$ and $(b_n)$ that converge to the same limit,
α, called the agm of $a_0$ and $b_0$. Convergence follows from the
inequality $a_{n+1} - b_{n+1} < \tfrac12(a_n - b_n)$.
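
The iteration is short enough to try directly. The following is a minimal sketch in floating-point arithmetic; the function name, the stopping tolerance and the iteration cap are illustrative choices. In practice only a handful of steps are needed, since the convergence is much faster than the halving guaranteed by the inequality.

```python
import math

def agm(a: float, b: float) -> float:
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(100):  # generous cap; a few iterations suffice in practice
        if abs(a - b) <= 1e-15 * max(a, b):
            break
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return (a + b) / 2.0

value = agm(1.0, math.sqrt(2.0))
print(value)  # approximately 1.19814
```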


Gauss denoted the agm of a and b by M(a, b). Plainly
M(λa, λb) = λM(a, b) and Gauss considered various functions of the
form M(1, x). For example,

$$M(1, 1+x) = M\left(1 + \tfrac{x}{2},\ \sqrt{1+x}\right),$$

so setting x = 2t + t² he obtained power series expansions with
undetermined coefficients for M in terms of x and then in terms of t,
from which the coefficients could be calculated. They display no
particular pattern, but various manipulations led Gauss to this dramatic
series for a reciprocal of M:

$$y = M(1+x, 1-x)^{-1} = 1 + \frac{1}{4}x^2 + \frac{9}{64}x^4 + \frac{25}{256}x^6 + \cdots
= 1 + \left(\frac12\right)^2 x^2 + \left(\frac{1\cdot3}{2\cdot4}\right)^2 x^4 + \left(\frac{1\cdot3\cdot5}{2\cdot4\cdot6}\right)^2 x^6 + \cdots$$
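
The series can be compared with the reciprocal of the agm numerically. The sketch below assumes the series is 1 + (1/2)²x² + (1·3/2·4)²x⁴ + ··· for M(1 + x, 1 − x)⁻¹ with |x| < 1; the function names and the number of terms are illustrative.

```python
import math

def agm(a: float, b: float) -> float:
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(100):
        if abs(a - b) <= 1e-15 * max(a, b):
            break
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return (a + b) / 2.0

def reciprocal_agm_series(x: float, terms: int = 200) -> float:
    """Sum 1 + (1/2)^2 x^2 + (1*3/(2*4))^2 x^4 + ... for |x| < 1."""
    total, ratio = 1.0, 1.0
    for n in range(1, terms):
        ratio *= (2 * n - 1) / (2 * n)  # 1*3*...*(2n-1) / (2*4*...*(2n))
        total += ratio * ratio * x ** (2 * n)
    return total

x = 0.3
print(reciprocal_agm_series(x), 1.0 / agm(1.0 + x, 1.0 - x))
```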
As a function of x, y satisfies the differential equation

$$(x^3 - x)\frac{d^2y}{dx^2} + (3x^2 - 1)\frac{dy}{dx} + xy = 0,$$

which is Legendre’s equation (11.2). Gauss also found another, linearly
independent, solution $M(1, x)^{-1}$.

The substitution x² = z turns the equation for $M(1 + x, 1 - x)^{-1}$
into this example of the hypergeometric equation

$$z(1-z)\frac{d^2y}{dz^2} + (1-2z)\frac{dy}{dz} - \frac14 y = 0,$$

as we shall now explain. It is another form of Legendre’s equation.
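
The change of variable can be carried out explicitly. The following worked step is a sketch that assumes the starting equation is (x³ − x)y″ + (3x² − 1)y′ + xy = 0 and writes y_z, y_zz for derivatives with respect to z = x²:

```latex
\begin{align*}
\frac{dy}{dx} &= 2x\,y_z, \qquad \frac{d^2y}{dx^2} = 2y_z + 4x^2 y_{zz} = 2y_z + 4z\,y_{zz}, \\
0 &= (x^3 - x)(4z\,y_{zz} + 2y_z) + (3x^2 - 1)\,2x\,y_z + x\,y \\
  &= x\left[\,4z(z-1)\,y_{zz} + (8z - 4)\,y_z + y\,\right], \\
\text{so}\quad 0 &= z(1-z)\,y_{zz} + (1 - 2z)\,y_z - \tfrac14\,y
  \quad\text{after dividing by } -4x .
\end{align*}
```

In the notation of the hypergeometric equation this is the case α = β = ½, γ = 1.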


11.3.1 The Hypergeometric Equation 


Gauss published only his study of the hypergeometric series [114], in
1812. The second part, on the hypergeometric equation [115], which is 
the differential equation satisfied by the hypergeometric series, was 
found among the extensive Nachlass and follows on from the first in 
numbered paragraphs (Sects. 38-57). 

The published paper is not remarkable by Gauss’s standards, 
although it considers x as a complex variable, and contains the earliest 
rigorous argument for the convergence of a power series and a study of 
the behaviour of the function at a point on the boundary of the circle of 
convergence, as well as a thorough examination of continued fraction 
expansions for certain quotients of hypergeometric functions. Part two 
is given over to finding several solutions of the hypergeometric 
equation and the relationships between them, and is of more interest to 
us here. 

In the first part, Gauss observed that the series

$$F(\alpha,\beta,\gamma,x) = 1 + \frac{\alpha\beta}{1\cdot\gamma}x + \frac{\alpha(\alpha+1)\beta(\beta+1)}{1\cdot2\cdot\gamma(\gamma+1)}x^2 + \cdots,$$

where α, β, and γ are real numbers, is a polynomial if either α − 1 or
β − 1 is a negative integer, and is not defined at all if γ is a negative
integer or zero (this case he excluded). In all other cases, the ratio test
shows that the series is convergent for x = a + bi whenever
a² + b² < 1. It is striking that Gauss was willing to introduce a new
function as a complex-valued function of a complex variable.

He gave, following Pfaff [209], a list of functions which can be
represented by means of hypergeometric functions. For example,

$$e^x = \lim_{k\to\infty} F\left(1, k, 1, \frac{x}{k}\right),$$

and the trigonometric functions can now be obtained. Gauss then
introduced the idea of contiguous functions (Sect. 1, Sect. 7):
F(α, β, γ, x) is contiguous to any of the six functions
F(α ± 1, β ± 1, γ ± 1, x) obtained from it by increasing or decreasing
one coefficient by 1. He obtained 15 equations connecting F(α, β, γ, x)
with each of the 15 pairs of its different contiguous functions by
systematically permuting the αs, βs, γs, etc. and comparing
coefficients. As an illustration, here is the first of these equations:

$$(\gamma - 2\alpha - (\beta-\alpha)x)F(\alpha,\beta,\gamma,x) + \alpha(1-x)F(\alpha+1,\beta,\gamma,x) - (\gamma-\alpha)F(\alpha-1,\beta,\gamma,x) = 0.$$
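
A relation of this kind is easy to test numerically by summing the series. The sketch below assumes the relation reads (γ − 2α − (β − α)x)F(α, β, γ, x) + α(1 − x)F(α + 1, β, γ, x) − (γ − α)F(α − 1, β, γ, x) = 0; the function names, the parameter values and the number of terms are illustrative.

```python
import math

def hyp_f(alpha: float, beta: float, gamma: float, x: float, terms: int = 400) -> float:
    """Partial sum of the hypergeometric series F(alpha, beta, gamma, x) for |x| < 1."""
    total, term = 1.0, 1.0
    for n in range(terms):
        # ratio of consecutive terms of the series
        term *= (alpha + n) * (beta + n) / ((1 + n) * (gamma + n)) * x
        total += term
    return total

def first_contiguous_residual(alpha: float, beta: float, gamma: float, x: float) -> float:
    """Left-hand side of the first contiguous relation; it should vanish."""
    return ((gamma - 2 * alpha - (beta - alpha) * x) * hyp_f(alpha, beta, gamma, x)
            + alpha * (1 - x) * hyp_f(alpha + 1, beta, gamma, x)
            - (gamma - alpha) * hyp_f(alpha - 1, beta, gamma, x))

print(first_contiguous_residual(0.3, 0.7, 1.9, 0.4))
# One of the elementary functions on such lists: x * F(1, 1, 2, x) = -log(1 - x).
print(0.5 * hyp_f(1.0, 1.0, 2.0, 0.5), math.log(2.0))
```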


As Felix Klein was to remark ([157], 16) these establish that any three
contiguous functions satisfy a linear relationship with rational
functions for coefficients. As a result, there are linear relationships over
the rational functions between any three functions of the form
F(α + m, β + n, γ + p, x), where m, n, and p are integers. Gauss then
showed how to use contiguous functions to provide continued
fraction expansions of quotients of hypergeometric functions, e.g.

$$\frac{F(\alpha,\beta+1,\gamma+1,x)}{F(\alpha,\beta,\gamma,x)},$$

and hence for several familiar elementary functions. Observe, as
Gauss did at the start of the second paper, that F(α, β, γ, x) and
dF(α, β, γ, x)/dx are contiguous in the obvious generalised sense, the
relationship between them being, essentially, the differential equation
itself. This is because

$$\frac{d}{dx}F(\alpha,\beta,\gamma,x) = \frac{\alpha\beta}{\gamma}F(\alpha+1,\beta+1,\gamma+1,x).$$
In the third and final section of the published paper, Gauss considered
the question of the value of F(α, β, γ, 1), i.e. of $\lim_{x\to1} F(\alpha,\beta,\gamma,x)$, for
real α, β, γ. He then defined

$$\Pi(k,z) = \frac{1\cdot2\cdot3\cdots k}{(z+1)(z+2)\cdots(z+k)}\,k^z,$$

where k is a positive integer, and $\Pi(z) = \lim_{k\to\infty}\Pi(k,z)$, which may be
called (Gauss’s) factorial function and is his version of the Gamma
function. The limit certainly exists for k = Na/€ and II satisfies the 


functional equation Π(z + 1) = (z + 1)Π(z) with Π(0) = 1, from which
it follows that Π(n) = n! for positive integral n. Π is infinite at all
negative integers.⁴ The factorial function enabled Gauss to obtain many
results that earlier mathematicians had obtained only with great effort. 
As Gauss puts it: “Whence many relations, which the illustrious 
Euler could only get with difficulty, fall out at once”. 
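
The limit defining the factorial function can be watched converging. The sketch below assumes the definition Π(k, z) = k! kᶻ/((z + 1)(z + 2)···(z + k)) and compares it with the modern Gamma function through Π(z) = Γ(z + 1); the function name and the value of k are illustrative, and the convergence is slow (the error decays roughly like 1/k).

```python
import math

def gauss_pi(k: int, z: float) -> float:
    """Gauss's Pi(k, z) = k! * k**z / ((z+1)(z+2)...(z+k)), for z not a negative integer."""
    product = 1.0
    for j in range(1, k + 1):
        product *= j / (z + j)  # builds k! / ((z+1)...(z+k)) without overflow
    return product * k ** z

approx_half = gauss_pi(100_000, 0.5)
print(approx_half, math.gamma(1.5))
print(gauss_pi(100_000, 3.0))  # close to 3! = 6
```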

Gauss began the second and unpublished part of the paper,
“Determinatio series nostrae per Aequationem Differentialem Secundi
Ordinis”, by observing that P = F(α, β, γ, x) is a solution of the
hypergeometric equation:

$$x(1-x)\frac{d^2P}{dx^2} + (\gamma - (\alpha+\beta+1)x)\frac{dP}{dx} - \alpha\beta P = 0.$$

To find a second linearly independent solution he set x = 1 − y, when
the equation becomes the first equation with γ replaced by
α + β + 1 − γ. It, therefore, has a solution F(α, β, α + β + 1 − γ, 1 − x),
and the differential equation, in general, has solutions of the form

$$MF(\alpha,\beta,\gamma,x) + NF(\alpha,\beta,\alpha+\beta+1-\gamma,1-x), \qquad (11.5)$$

where M and N are constants.

Other solutions may arise which do not at first appear to be of this 
type, but, he remarked, any three solutions must satisfy a linear 
relationship with constant coefficients. This fact was of most use to him 
when transforming the differential equation by means of a change of 
variable. 

The substitutions he considered are of two types: the following 
transformations of x: 


and these transformations of P: 
dz = p(dx + Vdy) + Udy, 


for particular values of 2. These gave him several solutions to the 
original equation in terms of functions like F(−, −, −, x) and
F(−, −, −, 1 − x) etc. (where the blanks stand for expressions in α, β,
and γ), possibly multiplied by powers of x and 1 − x, and also some
linear identities between triples of such solutions.

The paper concludes with a discussion of certain special cases that
can arise when α, β, and γ are not independent, for example, when


£B=a+1-y,and the quadratic change of variable gq? + h? < | canbe 
made. 


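Gauss made a very interesting observation at this point. The
equation has as one solution in this case:

$$F\left(\alpha,\beta,\alpha+\beta+\tfrac12,\,4y-4y^2\right) = F\left(2\alpha,2\beta,\alpha+\beta+\tfrac12,\,y\right).$$

If, he said, y is replaced by 1 − y this produces

$$F\left(\alpha,\beta,\alpha+\beta+\tfrac12,\,4y-4y^2\right) = F\left(2\alpha,2\beta,\alpha+\beta+\tfrac12,\,1-y\right),$$

as we can see by looking at the basis exhibited in Eq. (11.5) above, and
we are led to the seeming paradox

$$F\left(2\alpha,2\beta,\alpha+\beta+\tfrac12,\,y\right) = F\left(2\alpha,2\beta,\alpha+\beta+\tfrac12,\,1-y\right),$$

“which equation is certainly false” (Sect. 55).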


To resolve the paradox he distinguished between the sign F when it 
stood for a function that satisfies the hypergeometric equation, and 


when the sign F stood for the sum of an infinite series. The sum is only 
defined within its circle of convergence, but the function is to be 
understood for all values of its fourth term that have been obtained by 
continuous change, whether real or imaginary, provided the values 0 
and 1 are avoided. However, this “function” may be many-valued, and it 
is in this case. 

This being so, he argued that one would no more be misled than one
would infer from arcsin ½ = 30° and arcsin ½ = 150° that 30° = 150°,
the reason being that a (many-valued) function such as arcsin may
have different values even though its variable has taken the same value,
whereas a series may not.

Gauss here confronted the question of analytically continuing a 
function outside its circle of convergence. It was his view that the 
solutions of the differential equation exist everywhere but at 0, 1 (and
∞, although he avoided the expression), whereas their representation
in power series is a local question. However, if the function is a
many-valued function then the series expression may not be recaptured if the
variable is taken continuously along some path and restored to its 
original value, and neglect of this fact can lead to absurd expressions 
like the one Gauss produced. 

Because he here talked of continuous change in the variable in the 
complex number plane, one may thus infer that Gauss here was truly 
discussing analytic continuation, and not merely the plurality of series 
solutions at a given point. 

In these papers, Gauss introduced a large class of functions of a 
complex variable that were defined by the hypergeometric equation 
and were capable of various expressions in series. The main direction 
of his research was in studying relationships between the series, which 
in turn provided information about the nature of the functions under 
consideration. 

The second part of Gauss’s paper on the hypergeometric series 
raises two main types of question. First, it would be useful to have a 
systematic account of the solutions obtained by the various 
substitutions, and of the nature of the substitutions themselves. Second, 
it would be instructive to connect the hypergeometric functions with 


the newer functions in analysis, especially in complex analysis, such as 
the elliptic functions. It is striking that Kummer in his paper [167] set 
himself both these tasks and resolved them while, moreover, observing 
Gauss’s restrictions where the work would otherwise be too difficult 
(for example, by considering only real coefficients). 


11.4 Kummer and His 24 Solutions 


Ernst Eduard Kummer had studied Mathematics at the University of 
Halle, and in 1836, when he published his paper on the hypergeometric 
equation, he was 26 and a Lecturer at the Liegnitz Gymnasium. 
Although he never attended a lecture by Dirichlet, he considered him to 
have been his real teacher, which is an indication of Dirichlet’s great 
influence on mathematics in Germany, an influence that was then 
extended to Kummer’s best student at Liegnitz, Leopold Kronecker. 
Kummer, with Kronecker and Weierstrass, went on to dominate the 
Berlin school of mathematics from 1856 until Kummer retired in 1883. 
His students regarded him as a gifted teacher and organiser of 
seminars, and he was diligent in his concern for them. He was also a 
man of great charm, and he had a great appetite for administration, 
being Dean of the University of Berlin twice, Rector once, and Perpetual 
Secretary of the physics-mathematics section of the Berlin Academy 
from 1863 to 1878. 

At the start of his long paper (1836), Kummer remarked of Gauss’s 
paper that 


But this work is only the first part of a greater work as yet 
unpublished, and wants comparison of hypergeometric series in 
which the last element x is different. This will therefore be the 
principal purpose of the present work; the numerical 
application of the discovered formulae will preferably be made 
to elliptic transcendents, to which in great part the general 
series corresponds. 


The hypergeometric equation has three singular points,
x = 0, x = 1, x = ∞, and in a neighbourhood of these points one
expects the solution to be of the form of a hypergeometric series in
either x, 1 − x, or 1/x, respectively, possibly multiplied by some power
of x, 1 − x, or 1/x. This was one reason for Kummer to look for
appropriate changes of variables in the study of the hypergeometric
equation. Another was the need to find hypergeometric series that 
yield independent solutions of the hypergeometric equation other than 
the canonical hypergeometric series, because the differential equation 
has a basis of two independent solutions. 

Kummer investigated what happens to the hypergeometric 
equation under changes of variable, and found that the only changes of 
variable that can be made (unless there are special relations between 
the coefficients α, β, γ) are the ones that Gauss considered, including


the identity transformation z = | and the one Gauss had not written 
down, coskctcosks. 


The details of his argument allowed him to deduce more. For
example, if α is replaced by γ − α and β is replaced by γ − β this
produces a new solution:

$$y = (1-x)^{\gamma-\alpha-\beta}F(\gamma-\alpha,\gamma-\beta,\gamma,x).$$

Other similar changes produce the solutions

$$y = x^{1-\gamma}F(\alpha-\gamma+1,\beta-\gamma+1,2-\gamma,x),$$

and

$$y = x^{1-\gamma}(1-x)^{\gamma-\alpha-\beta}F(1-\alpha,1-\beta,2-\gamma,x).$$

These solutions can also be checked directly, by long but 
straightforward calculations. 
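
Short of the long calculation, a numerical check is quick. The sketch below assumes the equation in the form x(1 − x)y″ + (γ − (α + β + 1)x)y′ − αβy = 0 and tests the canonical series together with the three standard solutions (1 − x)^{γ−α−β}F(γ − α, γ − β, γ, x), x^{1−γ}F(α − γ + 1, β − γ + 1, 2 − γ, x) and x^{1−γ}(1 − x)^{γ−α−β}F(1 − α, 1 − β, 2 − γ, x), using central differences; the function names, parameter values and step size are illustrative.

```python
def hyp_f(alpha: float, beta: float, gamma: float, x: float, terms: int = 400) -> float:
    """Partial sum of the hypergeometric series F(alpha, beta, gamma, x) for |x| < 1."""
    total, term = 1.0, 1.0
    for n in range(terms):
        term *= (alpha + n) * (beta + n) / ((1 + n) * (gamma + n)) * x
        total += term
    return total

def ode_residual(y, x: float, alpha: float, beta: float, gamma: float, h: float = 1e-4) -> float:
    """Residual of the hypergeometric equation for y at x, via central differences."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return x * (1 - x) * d2 + (gamma - (alpha + beta + 1) * x) * d1 - alpha * beta * y(x)

alpha, beta, gamma = 0.3, 0.7, 1.4
solutions = [
    lambda x: hyp_f(alpha, beta, gamma, x),
    lambda x: (1 - x) ** (gamma - alpha - beta) * hyp_f(gamma - alpha, gamma - beta, gamma, x),
    lambda x: x ** (1 - gamma) * hyp_f(alpha - gamma + 1, beta - gamma + 1, 2 - gamma, x),
    lambda x: x ** (1 - gamma) * (1 - x) ** (gamma - alpha - beta)
              * hyp_f(1 - alpha, 1 - beta, 2 - gamma, x),
]
residuals = [ode_residual(y, 0.3, alpha, beta, gamma) for y in solutions]
print(residuals)  # all small, limited by the finite-difference error
```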

In this way, he was led to his finest achievement in this paper 
(Kummer 1836 [167], 52-53): an enumeration of a family of 24 
solutions to the hypergeometric equation that between them form 
what can be considered as the complete solution to the equation (see 
Fig. 11.2). In Kummer’s work, the variable is real, and he regarded the
24 solutions as the best way to represent solutions valid near the 


singular points in a variety of convenient forms. Later writers, 
Riemann and Schwarz, allowed the variable to be complex, in which 
case Kummer’s 24 solutions provide not only sets of bases for the 
solutions everywhere, but their inter-relations (which he also 
described) give a description of their analytic continuation on the 
complex sphere. However, this interpretation was almost completely 
lacking in Kummer’s work (despite the remarks of some later 
mathematicians and historians such as Klein ([158], 267) and 
Biermann ([16], 523)).⁶

We could check these solutions by introducing a new variable, say 
z = x/(x — 1), writing the hypergeometric equation in terms of z, and 


plugging in the new solution. If we do so, it is important to note that the 
transformed equation may no longer look like the hypergeometric 
equation. 

As you can see in the table he provided, Kummer expressed the
solutions of the hypergeometric equation in terms of a hypergeometric
series possibly multiplied by a power of $x$ or $1 - x$, where the variable
in the hypergeometric series is one of
\[ x, \quad 1-x, \quad \frac{1}{x}, \quad \frac{1}{1-x}, \quad \frac{x}{x-1}, \quad \text{or} \quad \frac{x-1}{x}. \]


The various solutions are valid on several different domains of
continuity, which for Kummer were intervals on the real line separated
by the points $x = 0$ and $x = 1$ and which Gauss would have
understood as discs and half-planes in the complex plane.
If we let $x$ be a complex variable, then it is easy to see that, because
$F(\alpha, \beta, \gamma, x)$ converges for $|x| < 1$, which is the disc $D_0$ centre 0 and
radius 1, the series in the variables
\[ 1-x, \quad \frac{1}{x}, \quad \frac{1}{1-x}, \quad \frac{x}{x-1}, \quad \frac{x-1}{x} \]
converge, respectively, in the domains

• $\{x : |1 - x| < 1\}$, which is the disc $D_1$ centre 1 and radius 1;
• outside the disc $D_0$;
• outside the disc $D_1$;
• in the half-plane of complex numbers with real part less than $\frac{1}{2}$;
• in the half-plane of complex numbers with real part greater than $\frac{1}{2}$.


Because some of these domains overlap, Kummer also established
the linear relations that exist between any three of the 24 solutions that
converge on a common neighbourhood. Some are simple equalities, for
example, between expressions (1), (2), (17), and (18) (see [167], 54-55):
\[ F(\alpha, \beta, \gamma, x) = (1 - x)^{\gamma-\alpha-\beta} F(\gamma-\alpha, \gamma-\beta, \gamma, x) \]
\[ = (1 - x)^{-\alpha} F\left(\alpha, \gamma-\beta, \gamma, \frac{x}{x-1}\right) = (1 - x)^{-\beta} F\left(\beta, \gamma-\alpha, \gamma, \frac{x}{x-1}\right). \]
In fact, there are six different families of four equal solutions thus:
1, 2, 17, 18; 3, 4, 19, 20; 5, 6, 21, 22; 7, 8, 23, 24; 9, 12, 13, 15; and 10, 11, 14, 16.

So, to find all the linear relations between the 24 solutions it is enough to
consider the six different ones 1, 3, 5, 7, 13, 14. Of these, 5 and 7
converge or diverge exactly when 13 and 14 diverge or converge,
respectively (Kummer here restricted $x$ to be real, but the observation
is valid for complex $x$). The problem is thus reduced to finding the
relations between the following triples:


Ide eos he 13e 13.1, 34s Was 7, 


and after some work he listed the relationships that arise. They all arise
from evaluating $F(\cdot, \cdot, \cdot, x)$ at $x = 0$ or $x = 1$.
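The simplest of these relations, the equality of expressions (1), (2), (17), and (18), can be checked numerically. The following is only a sketch under stated assumptions: the helper names `hyp_series` and `kummer_1_2_17_18` are hypothetical, the parameter values are illustrative and not taken from Kummer, and $x$ is taken real with $|x| < 1$ and $x < 1/2$ so that both $x$ and $x/(x-1)$ lie inside the unit disc.

```python
def hyp_series(a, b, c, x, terms=400):
    """Partial sum of the hypergeometric series F(a, b, c, x) for |x| < 1."""
    total, term = 0.0, 1.0  # term holds the n-th term of the series, starting at n = 0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((n + 1) * (c + n)) * x
    return total


def kummer_1_2_17_18(alpha, beta, gamma, x):
    """Values of Kummer's expressions (1), (2), (17), (18) at a real x < 1/2 with |x| < 1."""
    w = x / (x - 1)  # the variable used in expressions (17) and (18)
    s1 = hyp_series(alpha, beta, gamma, x)
    s2 = (1 - x) ** (gamma - alpha - beta) * hyp_series(gamma - alpha, gamma - beta, gamma, x)
    s17 = (1 - x) ** (-alpha) * hyp_series(alpha, gamma - beta, gamma, w)
    s18 = (1 - x) ** (-beta) * hyp_series(beta, gamma - alpha, gamma, w)
    return s1, s2, s17, s18


if __name__ == "__main__":
    # Illustrative values only; any gamma that is not zero or a negative integer will do.
    for value in kummer_1_2_17_18(0.3, 0.7, 1.9, 0.2):
        print(value)
```

The four printed values should agree to within rounding error, which is what the equality of the first family of four solutions asserts.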


What Kummer’s solutions also show, on letting $x$ be complex, is that
a solution of the hypergeometric equation is analytic everywhere
except at the three points $0, 1, \infty$ and that in a neighbourhood of any of
those points the solution becomes analytic on being multiplied by a
suitable power of $x$ or $1 - x$. In the language of later writers, this says
that the hypergeometric equation is an ordinary differential equation
with three regular singular points.
We shall see in Chap. 14 how Riemann deepened that insight. 


11.5 The Method of Undetermined 
Coefficients 
We can solve the hypergeometric equation 


\[ (x - x^2)y''(x) + (c - (a+b+1)x)y'(x) - aby = 0 \]


in a neighbourhood of the origin by the method of undetermined 
coefficients, and in this way realise that the solutions of the 
hypergeometric equation that are displayed in Fig. 11.2 are correct for 
various changes of variable. 


16) 


92) 
93) 
24) 


F(a, B, Y x), 
(1— x)" F(y¥—a, y—B x), 

a7 Fa—y +t, B—y +1, 2—7, =), 

a4 (1— a8 F(1—a, 1—B, 2—, z), 
F(a, B, a+ B—y-+1, 12), 
at F(a—y+1,B—y+1,a+8—y+1, 1-2), 
(l—axyF Fy —a, y—B, y—4—B +1, L—2), 
x7 (1L—ax)r-* Fi—a, 1— PB, y—a—P-+1, 1— x), 
«F(a, a—y+1,a—P +41, =), 
a? ¥(B, B—y+1, @—a-+1, +), 
a1 (1—w)-*-# F(t—a, y—a, RB—a-+l, =), 
w?-¥(1—ayr-*-# F(1—B, y—B,a—B+1, £), 
(i—ax)* F(a, y—B, a—B-+1, -—), 
(1—2)* F(8, y—a, B—a+1, 7), 
w-1(1—a)- F(a—y-+1, 1—2, a—B+1, 7+), 
a1(1—ay-P1 F(8—y +1, 1a, B—a +1, ~_), 


(ia) F(a, y—B, %» ==); 


. (i—x)? F(6, ¥— a; Ys =), 


a'1(1—ay F(a—y-+4,4—8, 2—y, —), 
a4(1—a)-? F(B—y $1, 1—a, 2—, -), 
2 F (a, a—y-+1, a+6—y+1, =), 


2? F(B, B—y +1, 0-+8—y +1, =), 
n1(1—ay* F(1—a, y—a, y—a—B-++1, ==), 
2-112)? F(1—B, y—B, y—a—B 41, ==), 


Fig. 11.2. Kummer’s 24 solutions, from Kummer ([167], 52-53) 


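To do this, we write
\[ y = x^k(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots) = x^k f(x), \quad a_0 \neq 0. \]
This gives
\[ y'(x) = k x^{k-1}(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots) + x^k(a_1 + 2a_2 x + \cdots + n a_n x^{n-1} + \cdots), \]
\[ y''(x) = k(k-1)x^{k-2}(a_0 + a_1 x + \cdots + a_n x^n + \cdots) + 2k x^{k-1}(a_1 + 2a_2 x + \cdots) + x^k(2a_2 + \cdots + n(n-1)a_n x^{n-2} + \cdots), \]
so
\[ (x - x^2)y''(x) + (c - (a+b+1)x)y'(x) - aby = a_0\bigl(k(k-1) + ck\bigr)x^{k-1} + \text{higher terms}, \]
so
\[ k(k-1) + ck = 0, \]
and so
\[ k = 0 \quad \text{or} \quad k = 1 - c. \]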
In the case when $k = 0$, we find by looking at the constant term that
$-ab\,a_0 + c\,a_1 = 0$, so
\[ a_1 = \frac{ab}{c}\,a_0. \]
Likewise, on tidying up, we find that
\[ a_2 = \frac{(a+1)(b+1)}{2(c+1)}\,a_1, \]
and by looking carefully at the coefficient of $x^n$ that
\[ (-ab - n(a+b+1) - n(n-1))a_n + (n+1)(c+n)a_{n+1} = 0. \]
So
\[ (ab + na + nb + n^2)a_n = (n+1)(c+n)a_{n+1}, \]
and so
\[ a_{n+1} = \frac{(a+n)(b+n)}{(n+1)(c+n)}\,a_n. \]


So, in this case, the solution of the hypergeometric equation is the
hypergeometric series
\[ y(x) = a_0\left(1 + \frac{ab}{1 \cdot c}\,x + \frac{a(a+1)b(b+1)}{1 \cdot 2 \cdot c(c+1)}\,x^2 + \cdots\right). \]

It is an interesting exercise to determine the domains of convergence of 
these series, which relates to the changes of variable used by Gauss, 
Kummer, and later Riemann. The contiguous equations mentioned by 
Gauss can be used to find a second, linearly independent solution to the 
hypergeometric equation and convergent on the same domain, as 
Gauss indicated. 
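The recurrence for the coefficients can be turned into a short computation. This is a minimal sketch, not a method taken from the sources discussed here: the function name `hypergeometric_series` is hypothetical, the sum is truncated after a fixed number of terms, and it assumes $|x| < 1$ and that $c$ is not zero or a negative integer. The two closed forms used for comparison, $F(1, 1, 2, x) = -\log(1-x)/x$ and $F(a, b, b, x) = (1-x)^{-a}$, are standard special cases.

```python
import math


def hypergeometric_series(a, b, c, x, terms=200):
    """Partial sum of F(a, b, c, x), built from
    a_{n+1} = (a + n)(b + n) / ((n + 1)(c + n)) * a_n with a_0 = 1."""
    if abs(x) >= 1:
        raise ValueError("this sketch only sums the series for |x| < 1")
    total = 0.0
    term = 1.0  # a_n * x**n, starting with n = 0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((n + 1) * (c + n)) * x
    return total


if __name__ == "__main__":
    x = 0.3
    # F(1, 1, 2, x) should agree with -log(1 - x) / x
    print(hypergeometric_series(1, 1, 2, x), -math.log(1 - x) / x)
    # F(a, b, b, x) should agree with (1 - x)**(-a)
    print(hypergeometric_series(0.5, 2, 2, x), (1 - x) ** -0.5)
```

Because the ratio of successive terms tends to $x$, the truncated sum converges quickly well inside the unit disc and slowly near its boundary.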


11.6 Exercises 


1.
Derive Eq. (11.2)
\[ \frac{d^2F^1}{dc^2} + \frac{1-3c^2}{c(1-c^2)}\frac{dF^1}{dc} - \frac{F^1}{1-c^2} = 0 \]
and (11.3)
\[ \frac{d^2E^1}{dc^2} + \frac{1}{c}\frac{dE^1}{dc} + \frac{E^1}{1-c^2} = 0. \]

2.
Calculate the arithmetico-geometric means of several pairs of
numbers, including this one that Gauss did, the arithmetico-geometric
mean of 1 and $\sqrt{2}$.
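For readers who want to experiment, the iteration behind the arithmetico-geometric mean can be sketched in a few lines. This is an illustrative sketch only: the function name `agm`, the stopping tolerance, and the iteration cap are assumptions of this example and are not taken from Gauss. The sketch also takes the pair Gauss worked with to be 1 and $\sqrt{2}$.

```python
import math


def agm(a, b, rel_tol=1e-15, max_iter=100):
    """Arithmetico-geometric mean of two positive numbers.

    Repeatedly replaces (a, b) by their arithmetic and geometric means
    until the two values agree to within the relative tolerance.
    """
    if a <= 0 or b <= 0:
        raise ValueError("both numbers must be positive")
    for _ in range(max_iter):
        if abs(a - b) <= rel_tol * max(a, b):
            break
        a, b = (a + b) / 2, math.sqrt(a * b)
    return (a + b) / 2


if __name__ == "__main__":
    print(agm(1.0, math.sqrt(2.0)))
    print(agm(1.0, 2.0))
```

The two sequences close in on each other very rapidly, so only a handful of iterations are needed for full double precision.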


Questions 
1. Find out what Gauss found significant about the arithmetico-geometric
mean of 1 and $\sqrt{2}$ (it is nr. 98 in his mathematical
diary).

2.
Why do you think Gauss published his investigations into the 
hypergeometric series but not into the hypergeometric equation? 
What other subjects did Gauss investigate but not publish? 


Footnotes 


1 For a much fuller account, see Gray [124] and Bottazzini and Gray [22].


2 The first paper, E252, was presented to the Berlin Academy in January 1752, and the second, 
E251, was presented to the St. Petersburg Academy in April 1753. Both were published for the 
first time in 1761, which says something about the turbulent conditions of the time. 


3 It is sometimes said that Euler studied the first of these in 1750, supposedly in the paper 
E154, which is indeed about the rectification of the ellipse. But this is not quite true; there Euler 


studied the similar but different differential equation Gi py £4 _ Lp? dq 


dp? p rae 


4 In the usual notation from Legendre ([184], Vol. II), $\Pi(z) = \Gamma(z + 1)$.


5 See also his Collected Papers Vol. II, 88, 89. 


6 Kummer concluded the paper with an unremarkable study of what happens when $x$ is
allowed to be complex but $\alpha$, $\beta$, and $\gamma$ stay real that has no bearing on the issue of analytic
continuation.
12.1 Introduction


It was surely inevitable that the confident methods of the eighteenth 
century for solving ordinary differential equations, which were more 
formal than rigorous, would be critically examined by Augustin Louis 
Cauchy (Fig. 12.1), and they were. Curiously, however, although he 


appreciated the difference between real and complex methods he chose 


to advertise his complex power series methods and seemingly forgot 


that he had been the first to give a good account of ordinary differential 


equations in line with the insights of his own x + y analysis. 


Cauchy also built on the work of Monge and Lagrange to provide 


existence theorems for partial differential equations, but in this case his 
use of real analysis for the first-order case and analytic functions for the 


general case reflects a fundamental difference in what could be 
established. This is discussed in Chap. 17. 


12.2 Cauchy and Ordinary Differential 
Equations 


Cauchy’s interest in analysis was not confined to providing improved 
foundations for the subject. He applied it in many domains of 
mathematics and in so doing greatly extended it. As one example of 
this, we discuss his work on the topic of showing that ordinary 
differential equations have solutions. 

He tackled this question twice in his life. The first time was as early 
as 1820 or 1821, when he was lecturing at the Ecole Polytechnique, and 
the second some 15 years later, when he was in self-imposed exile in 
Prague. As we shall see, for many years the second method eclipsed the 
first, for reasons that tell us a lot about the mathematical community of 
the time and its priorities. 
Fig. 12.1 Augustin Louis Cauchy (1789-1857), artist unknown, after J. Rollet, 1840


In 1820, advanced mathematics students learned how to solve 
differential equations from Lacroix’s Traité du calcul différentiel et du 
calcul intégral. This was a deliberate ragbag of methods. Cauchy broke 
with this approach, and with the views of the Directorship of the Ecole, 
by tackling a question that Lacroix had ignored: Does a first-order
ordinary differential equation have a solution? 

It might seem surprising that Lacroix had not dealt with this 
question himself, but there are several reasons why he did not. Insofar 
as differential equations arise in mathematical physics, it can seem 
obvious that they have solutions—the only problem is how to find 
them. Lacroix was not much interested in rigour, and it is always 
possible to check that a suggested solution is correct. Even the 
theoretical formulation of a differential equation, as a curve about 
which some information about the behaviour of the tangents is given 
(or a function about which some information about the behaviour of its 
derivatives is given) seems to imply that there is a function that has the 
prescribed property. 

But rigour in these matters greatly concerned Cauchy. Christian 
Gilain, the modern editor of Cauchy’s work in the 1820s on the topic, 
commented that Cauchy’s existence theorem is not a discovery inserted 
in an otherwise classical text, but ([39], xxi) 


it has a very important place not only in the great number of 
pages devoted to the proof and its examples, but in its role in the 
whole organisation of the course. It is truly a work in the 
foundations in the general theory of differential equations ... 


The lecture notes of a course on ordinary differential equations that
Cauchy gave were lost, and were not republished in the 31 volumes of
his Oeuvres. However, a set of 13 lectures, in the original edition printed
by the Ecole Polytechnique, was found by Gilain in the Bibliothèque
Nationale in Paris and published in 1981. Unfortunately, it is apparently
impossible to date them precisely; Cauchy seems to have had the idea
by 1820 or 1821, and probably gave the lectures in 1823 or 1824.

In his opening five lectures, Cauchy largely followed tradition and 
dealt with explicit solution methods. Lecture 6 prepared the way for 


Lecture 7, in which Cauchy observed that explicit solutions to a
differential equation of the form
\[ \frac{dy}{dx} = f(x, y) \qquad (12.1) \]
yield a family of solutions, so one could also find the solution that took
a particular value $y_0$ at a particular point $x_0$.


Cauchy now proposed that the important problem was to establish
for every differential equation the existence of a solution that took a
particular value $y_0$ at a particular point $x_0$. The existence of a general
solution would then follow by allowing $x_0$ and $y_0$ to vary. This is not a
pedantic point. Lacroix, like others before him, had regarded a solution
as a formula and the methods for finding a solution were largely formal;
that is what was meant by having a general solution. Cauchy inverted
the process, and asked if there is any solution at all that would have the
required properties in the light of his revised standards for analysis as a
whole.

In Lecture 7, he sketched a method for proving the existence of
solutions to the differential equation
\[ \frac{dy}{dx} = f(x, y). \]
For the first time, we get restrictions on the function $f$, without which
one suspects nothing can be proved. He showed that


Theorem 12.1 If the function $f$ is continuous and bounded as
a function of $x$ and $y$ on a neighbourhood of the point $(x_0, y_0)$ in the
plane, and the partial derivative $\partial f/\partial y$ is also continuous and bounded in
that neighbourhood, then there is a solution to the equation.


He argued the solution function is very close to the collection of points
$(x_j, y_j)$ given by
\[ y_j = y_{j-1} + (x_j - x_{j-1}) f(x_{j-1}, y_{j-1}) \]
when the points $x_j$ are close together. This is what you would expect,
because when $x_j - x_{j-1}$ is small the quotient
\[ \frac{y_j - y_{j-1}}{x_j - x_{j-1}} \]
should be very close to the value of $\frac{dy}{dx}$ and so to the function $f$ at the
point $(x_{j-1}, y_{j-1})$. Then the stated conditions imply, by a limiting
argument, that as the points $x_j$ get closer and closer together, the
points $(x_j, y_j)$ lie on a curve that is the graph of the integral $\int f(x, y)\,dx$,
which is the solution of the differential equation and for which
$y(x_0) = y_0$, so the theorem is proved.
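The polygon construction, in which each new ordinate is obtained as $y_j = y_{j-1} + (x_j - x_{j-1}) f(x_{j-1}, y_{j-1})$, is easy to try out numerically. The following is a minimal sketch rather than a reconstruction of Cauchy's own computations: the function name `cauchy_polygon` is hypothetical, the mesh is taken equally spaced for simplicity (Cauchy allowed arbitrary subdivisions), and the right-hand side $f(x, y) = y$ with $y(0) = 1$ is an illustrative choice whose exact solution is $e^x$.

```python
import math


def cauchy_polygon(f, x0, y0, X, n):
    """Return the points (x_j, y_j), j = 0..n, of the polygon defined by
    y_j = y_{j-1} + (x_j - x_{j-1}) * f(x_{j-1}, y_{j-1})."""
    xs = [x0 + (X - x0) * j / n for j in range(n + 1)]  # equally spaced mesh (an assumption)
    ys = [y0]
    for j in range(1, n + 1):
        ys.append(ys[-1] + (xs[j] - xs[j - 1]) * f(xs[j - 1], ys[-1]))
    return list(zip(xs, ys))


if __name__ == "__main__":
    def rhs(x, y):
        return y  # illustrative right-hand side; exact solution through (0, 1) is exp(x)

    for n in (10, 100, 1000):
        x_end, y_end = cauchy_polygon(rhs, 0.0, 1.0, 1.0, n)[-1]
        print(n, y_end, abs(y_end - math.e))
```

As the subdivision is refined the final ordinate approaches $e$, which illustrates the limiting argument in a case where the answer is known independently.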


To be more precise, let us follow Cauchy and fix some notation. In
Lecture 7, Cauchy considered the ordinary differential equation
\[ \frac{dy}{dx} = f(x, y) \]
for $x \in [x_0, X]$ on the assumptions that $f$ is continuous and bounded,
$|f(x, y)| < A$, and $f_y$ is continuous and bounded, $|f_y(x, y)| < C$. He
supposed the initial value of $y$ was $y_0$ and he set $X - x_0 = H$.
He took a sequence of points $x_0 < x_1 < \ldots < x_n = X$ and defined a
sequence of points $y_0, y_1, \ldots, y_n = Y$ by the equations
\[ y_1 - y_0 = (x_1 - x_0) f(x_0, y_0), \]
\[ y_2 - y_1 = (x_2 - x_1) f(x_1, y_1), \]
\[ \cdots \]
\[ Y - y_{n-1} = (X - x_{n-1}) f(x_{n-1}, y_{n-1}). \]
He then took it for granted (validly) that
\[ Y = F(x_0, x_1, \ldots, x_{n-1}, X, y_0) \]
is a continuous function of all its variables, and proved that as a result
\[ Y = y_0 + (X - x_0) f(x_0 + \theta H, y_0 + \Theta AH) \qquad (12.2) \]
for some $0 < \theta < 1$ and some $\Theta$ with $-1 < \Theta < 1$. (He did not use the
continuity of $f_y$ here.)


The proof has three small steps. First, summing up the equations
above gives $Y - y_0$ on the left-hand side, and on the right-hand side a
term that can be rewritten as $X - x_0$ times some average value of the
various $f(x_{j-1}, y_{j-1})$ that Cauchy wrote as S, by what we would call
the intermediate value theorem, for some $0 < \theta < 1$.

Second, similarly, for every $j$,
\[ |y_j - y_0| \le (X - x_0)A = HA \]
and so $y_j$ may be replaced by an average, $y_j = y_0 + \Theta AH$, for some $-1 < \Theta < 1$.

Third, therefore, the various values of the $f(x_{j-1}, y_{j-1})$ are all of the
form $f(x_0 + \theta H, y_0 + \Theta AH)$. A further averaging argument (or, we
would say, use of the intermediate value theorem) gave the required
result.
Cauchy then investigated the dependence of $Y$ on the initial value $y_0$,
and now he needed the conditions on $f_y$. He proved the theorem that
if the initial point is $y_0'$, where $y_0' - y_0 = \beta_0$, then
\[ |Y(y_0') - Y(y_0)| = \Theta|\beta_0| e^{CH} \]
for some $\Theta$, $0 \le \Theta \le 1$. This says that the solution (if it exists) is
unique.
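The proof began by looking at $y_1'$ and defining $\beta_1$ by the equation
$y_1' - y_1 = \beta_1$. This gives
\[ \beta_1 - \beta_0 = (x_1 - x_0)(f(x_0, y_0') - f(x_0, y_0)). \]
The term in $f$ can be written (again by the intermediate value theorem)
as
\[ f_y(x_0, y_0 + \theta_0(y_0' - y_0))(y_0' - y_0) = \beta_0 f_y(x_0, y_0 + \theta_0(y_0' - y_0)), \]
for some $0 < \theta_0 < 1$, so
\[ \beta_1 = \beta_0 + \beta_0 (x_1 - x_0) f_y(x_0, y_0 + \theta_0(y_0' - y_0)). \]
By a further use of the intermediate value theorem this becomes
\[ \beta_1 = \beta_0\bigl(1 + \Theta_0 C(x_1 - x_0)\bigr), \]
for (yet another) $\Theta_0$ with $-1 < \Theta_0 < 1$. A similar argument applies at every step.
To combine the equations that result, Cauchy noted that
\[ 1 + C(x_j - x_{j-1}) \le e^{C(x_j - x_{j-1})}. \]
Therefore
\[ |\beta_n| \le |\beta_0| e^{C(x_1 - x_0)} e^{C(x_2 - x_1)} \cdots e^{C(X - x_{n-1})} = |\beta_0| e^{C(X - x_0)} = |\beta_0| e^{CH}, \]
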


and the result follows. 
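The estimate can be watched at work numerically: two polygons started a distance $|\beta_0|$ apart should end no further apart than $|\beta_0| e^{CH}$, where $C$ bounds $|f_y|$ and $H = X - x_0$. This is a sketch under stated assumptions: the function name `polygon_endpoint` is hypothetical, the mesh is equally spaced, and the right-hand side $f(x, y) = x + \sin y$ is an illustrative choice for which $|f_y| = |\cos y| \le 1$, so that $C = 1$.

```python
import math


def polygon_endpoint(f, x0, y0, X, n):
    """Final ordinate Y of the polygon y_j = y_{j-1} + h * f(x_{j-1}, y_{j-1}), h = (X - x0) / n."""
    h = (X - x0) / n
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y


def rhs(x, y):
    """Illustrative right-hand side with |df/dy| = |cos y| <= 1."""
    return x + math.sin(y)


if __name__ == "__main__":
    C, H, beta0 = 1.0, 1.0, 0.01  # bound on |f_y|, interval length, initial separation
    Y = polygon_endpoint(rhs, 0.0, 0.0, H, 100)
    Y_shifted = polygon_endpoint(rhs, 0.0, beta0, H, 100)
    print(abs(Y_shifted - Y), "<=", abs(beta0) * math.exp(C * H))
```

The printed separation stays below the bound for any number of steps, because each step multiplies the separation by a factor of size at most $1 + Ch \le e^{Ch}$.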
Cauchy then checked that the theorems are true independently of
the choice of the $x_j$, and could then state the theorem that if $y = F(x)$
is the limiting value of the function $Y = F(x_0, x_1, \ldots, x_j, \ldots, x_{n-1}, X, y_0)$
as $n$ increases indefinitely and the distances $x_j - x_{j-1}$ decrease
indefinitely, then $y = F(x)$ satisfies the differential equation
$\frac{dy}{dx} = f(x, y)$ with the initial condition $F(x_0) = y_0$.

The proof is little more than using the intermediate value theorem
to show that $y$ is a continuous function of $x$, to bound the variation in $y$
as a function of $x$, and to deduce that in the limit as the intervals
between (otherwise arbitrary) $x$-values tend to zero the function $y$ is a
differentiable function of $x$ and $\frac{dy}{dx} = f(x, y)$ in the region considered.


In Lecture 8, he strengthened the theorem to show that the stated
conditions on $f$ and $\partial f/\partial y$ imply that there is a solution to the differential
equation on some interval $(x_0, x_0 + a)$. He showed that the differential
equation has a solution for values of $x$ between $x_0$ and $x_0 + a$ and
values of $y$ between $y_0 - Aa$ and $y_0 + Aa$, where $a$ is a quantity
determined by the function $f$ and $A$ is greater than the absolute value of
$f(x, y)$ in the interval considered (so $A$ depends on $a$). Curiously,
Cauchy did not prove that the constant $a$ always has a non-zero value
but only assumed it.

As in Lecture 7, Cauchy took a sequence of points $x_0, x_1, \ldots, x_m$ and
the corresponding points $y_0, y_1, \ldots, y_m$ such that
\[ y_{j+1} - y_j = (x_{j+1} - x_j) f(x_j, y_j), \]
and for all $j$, $|y_{j+1} - y_j| < Aa$. Adding these equations gave him
\[ y_m - y_0 = \sum_{j=0}^{m-1} (x_{j+1} - x_j) f(x_j, y_j) \]
and the term on the right-hand side is equal to $(x_m - x_0) f^*$, where $f^*$
is an “average” of the $f(x_j, y_j)$s. By assumption, for every $j$,
\[ |f(x_j, y_j)| < A, \quad 0 \le j \le m - 1, \]
and so their average is also less than $A$. Therefore, if
\[ y_0 - Aa \le y_j \le y_0 + Aa, \quad 0 \le j \le m - 1, \]
then also $y_0 - Aa < y_m < y_0 + Aa$, and so each $y_j$ is such that $|y_j - y_0| < Aa$.
It follows that for $x_0 < x < x_0 + a$ and $y_0 - Aa < y < y_0 + Aa$ the
functions $f(x, y)$ and $\partial f/\partial y$ are continuous and bounded; and
\[ |f(x, y)| \le A. \]
Therefore, the limit as the $x_j$ get closer and closer together exists
and
\[ y - y_0 = \int_{x_0}^{x} f(x, y)\,dx \]
is a solution of the differential equation
that satisfies the given initial conditions.

Cauchy did not stop with this local result. He sought conditions
which would allow him to prolong the solutions out of the
neighbourhood in which they have been shown to exist. He found that
this could only be done indefinitely if certain necessary conditions were
met, and gave examples to show that some differential equations have
solutions that only exist for a limited range of the variable $x$. We can
note this simple one: the differential equation $\frac{dy}{dx} = 1 + y^2$, for which
$x = 0$ implies $y = 0$, has the solution $y = \tan x$, but this is only defined
on the interval $(-\pi/2, \pi/2)$.
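A short worked check may help here. It assumes that the example intended is the standard one, $dy/dx = 1 + y^2$ with $y = 0$ at $x = 0$; that reading is an assumption, but it is the one that matches the stated interval. Separating the variables shows why the solution cannot be prolonged beyond $(-\pi/2, \pi/2)$.

```latex
\[
\frac{dy}{1 + y^2} = dx
\;\Longrightarrow\;
\arctan y = x + c,
\qquad
y(0) = 0 \;\Rightarrow\; c = 0,
\qquad
y = \tan x .
\]
% As x tends to \pm\pi/2 the solution tends to \pm\infty, even though
% the right-hand side f(x, y) = 1 + y^2 is smooth everywhere in the plane.
```

The right-hand side is bounded on every bounded neighbourhood, so the local theorem applies at each point, yet no single solution exists for all $x$.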


We should also note that these theorems give sufficient conditions
for a solution to exist, but they are not necessary.
Before we proceed, it will help to look at two examples. We start
with the simple case
\[ \frac{dy}{dx} = -\frac{x}{y} \]
with the initial conditions that at $x = x_0$ we have $y = y_0$.

Suppose we use the theorems to look for a solution for which when
$x = 0$ we have $y = 1$. The theorems tell us that there is a unique
solution valid in some neighbourhood of the point $(x, y) = (0, 1)$.

But if we use the theorems to look for a solution for which when
$x = 1$ we have $y = 0$ then the theorems tell us nothing, because in any
neighbourhood of the point $(1, 0)$ the quotient $x/y$ is unbounded.

These results correspond to the fact that the solution of the
differential equation in each case is $x^2 + y^2 = 1$, and in the first case
the initial conditions imply that the unique solution in this case is
$y = (1 - x^2)^{1/2}$, whereas there is no single-valued solution to the
differential equation in any neighbourhood of the point $(1, 0)$. This also
brings up the important point that for Cauchy a solution to a
differential equation may be given only implicitly, in the form
$f(x, y) = 0$, and not explicitly (as $y = F(x)$).
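The implicit solution in this example can be recovered in one line. The sketch below assumes the equation is $dy/dx = -x/y$; the minus sign is an assumption here, chosen because it is the sign that produces the circle $x^2 + y^2 = 1$.

```latex
\[
\frac{dy}{dx} = -\frac{x}{y}
\;\Longrightarrow\;
x\,dx + y\,dy = 0
\;\Longrightarrow\;
x^2 + y^2 = x_0^2 + y_0^2 .
\]
% Through (0, 1): x^2 + y^2 = 1 and y = (1 - x^2)^{1/2} near x = 0.
% Through (1, 0): the same circle, but its tangent there is vertical,
% so y is not a single-valued function of x in any neighbourhood.
```

The two initial points lie on the same curve; what changes is whether the right-hand side stays bounded nearby.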


Now consider the first-order ordinary differential equation
\[ \frac{dy}{dx} = \frac{1}{x + y}. \]
It looks as if something could go “wrong” when $x + y = 0$; let us see
what happens.
We can solve this equation explicitly by writing
\[ x + y = z, \quad \text{so} \quad dx + dy = dz, \]
when the equation becomes
\[ \frac{dz}{dx} = 1 + \frac{1}{z}, \]
or
\[ \frac{dz}{dx} = \frac{1 + z}{z}. \]
We separate the variables and it becomes
\[ \frac{1 + z - 1}{1 + z}\,dz = dx, \]
so
\[ \left(1 - \frac{1}{1 + z}\right)dz = dx, \]
so
\[ x = z - \log a(1 + z), \]
where $a$ is an arbitrary positive constant. This equation, in the original
variables, says
\[ x = x + y - \log a(1 + x + y), \]
or
\[ y = \log a(1 + x + y), \]
or
\[ e^y = a(1 + x + y). \]
This says that when $x + y = 0$ we have $y = \log a$, which, unexpectedly, is a
constant. What has happened is illustrated in Fig. 12.2, where $a = 2$. It
is clear that $y$ cannot be a single-valued function of $x$ in a
neighbourhood of the point where $-x = y = \log 2$.

Fig. 12.2 The graph of $e^y = 2(1 + x + y)$


The figure confirms what the “infinite” value of $\frac{dy}{dx}$ already said: the
implicitly defined function that is the solution of the differential
equation cannot be written as a function $y$ of $x$ in a neighbourhood of
the “bad” point.
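That the implicit relation $e^y = a(1 + x + y)$ really does solve $dy/dx = 1/(x + y)$ can be confirmed by differentiating it. The following worked check is a sketch added for illustration and uses only the relation stated above.

```latex
\[
e^y = a(1 + x + y)
\;\Longrightarrow\;
e^y \frac{dy}{dx} = a\left(1 + \frac{dy}{dx}\right)
\;\Longrightarrow\;
\frac{dy}{dx} = \frac{a}{e^y - a}
= \frac{a}{a(1 + x + y) - a}
= \frac{1}{x + y}.
\]
% The denominator e^y - a vanishes exactly when y = \log a, that is, when x + y = 0:
% the curve has a vertical tangent there, which is the "bad" point.
```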

It might seem that Cauchy paid a high price for rigour. Whereas 
earlier mathematicians offered general formulae, he could only provide 
solutions in a neighbourhood of an initial point where the differential 
equation was well behaved. This is, of course, entirely of a piece with 
his theory of functions, in which, for example, power series may 
converge only for a limited range of the variable. 

These lectures by Cauchy provide the first proof of the existence of 
solutions (locally, at least) to a first-order differential equation. The 
historian of mathematics has a particular interest in them because they 


illuminate the frailty of our knowledge of even the recent past. Unlike 
much of Cauchy’s work, they are not contained in the 31 volumes of his 
Oeuvres complètes. They are not even listed by the editors of those
works as being among the texts omitted from the collected works. It
was known that they had existed only because Cauchy mentioned them
in a résumé of 1835, and because his friend the Abbé Moigno gave an
account of them in his book of 1841. Only in 1979 did Gilain track down 
a printed copy of the first 13 lectures in the archives of the Institut de 
France. 

The disappearance of these lectures is hard to explain, because 
Cauchy was an energetic publisher and republisher of his own results. 
A crucial factor is likely to have been Cauchy’s discovery of a different 
proof in 1835, one he did publish (twice) and which did catch on. 

By then, Cauchy was in Prague, officially as a tutor to the Bourbon 
Dauphin (who, unsurprisingly, had no interest in learning what 
Cauchy had to say). In the introduction to his Prague paper, 

Cauchy stated that the integration of differential equations by series
was “illusory, so long as one did not provide any means of assuring that
the series so obtained were convergent”.¹ This emphasis on series
solutions is a sign of an important departure from this earlier
presentation, although he did claim that his new method shared the
advantages of the earlier one. But now he worked from the start with
systems of first-order ordinary differential equations satisfying given
initial conditions. Following what he called Hamilton’s “wonderful
paper” of 1834 on the differential equations of dynamics (see Sect. 24.3),
he reduced the system of ordinary differential equations to a single
first-order linear partial differential equation. He then showed how to 
find enough particular solutions of the partial differential equation to 
find the general solution of the original ordinary differential equation. 
He then used his rigorous methods of analysis to show that the 
particular integrals could be expanded in convergent power series. 

What is curious about this paper is that the theoretical passages 
treat the independent variable x as a complex variable, but in the 
examples he gave to illustrate the theory the variable x is regarded as 
real. In their comments on this, the historians Bottazzini and Gray
remark that:²


Apparently, even as late as this Cauchy did not seem to recognize 
the deep difference between the real and the complex case of his 
existence theorems, and the ambiguity between the real and 
complex ran through his entire paper, so much so that in his 
concluding remark he proudly stated that his new theorems 
could “easily” be extended to the solution of differential 
equations in which the variables and functions involved become 
imaginary. Precisely for this reason he preferred his second 
existence theorem based on the calculus of limits, which had 
transformed the integration of differential equations into a 
rigorous theory. 


So in this paper, and indeed in a number of papers he went on to write 
that drew on this paper, the condition on the functions that enter the 
differential equation is that they are complex analytic. This was to turn 
out to be a very much stricter condition than being infinitely 
differentiable, and very much stronger than the modest conditions on 
differential equations of a real variable that Cauchy had assumed in 
1821. The only reasons Cauchy can have preferred his Prague paper to 
his earlier account are that either he did not appreciate the difference 
in the conditions, or he thought that the more interesting case was 
functions of a complex variable anyway. The former reason is plausible 
for the period. The latter one is also plausible and it fits the
intermittent but deep interest Cauchy had in establishing a rigorous 
theory of functions of a complex variable. 


12.2.1 Later Developments 


Cauchy’s early emphasis on the local theory of real ordinary differential 
equations was taken up and polished by Rudolf Lipschitz and Emile 
Picard to give a sound account, using different methods, of the 
existence of solutions to ordinary differential equations. For a brief look
at the creation of the modern theory of real ordinary differential
equations, see Appendix G. Picard’s account is the final part of his study 
[211] of partial differential equations, which we shall look at in 

Chap. 28. 


12.3 Exercises 


1.
Show that Cauchy’s theorems 7 and 8 do not apply to the 
differential equation y’ = y!/?. What happens in this case? 


2.
Show that Cauchy’s theorems 7 and 8 do not apply to the 


differential equation y’ = *. What happens in this case? 


Questions 


1.
Cauchy gave these lectures around the time he gave the famous 


courses in which his new approach to analysis was put on record 
for the first time. Which of the ideas introduced in theorems 7 and 
8 do you think need further investigation before they could be said 
to be rigorous? 


Ask yourself this question again when you have seen Cauchy’s 
lectures from 1819 on first-order partial differential equations in 
Chap. 17. 


Footnotes 
1 See Cauchy ([35], 400). 


2 See Bottazzini and Gray ([22], 162). 
13.1 Introduction


Complex function theory was relatively new in the mid-nineteenth
century.¹ After decades of intermittent interest, Cauchy began to draw
his insights together in the late 1840s so that a younger generation
could appreciate them, and more-or-less independently
Riemann (Fig. 13.1) began to develop his theory, starting in 1851. This
chapter looks specifically at Riemann’s approach to the subject, which 
greatly transformed it through a balance of intuitive and often 
geometrical ideas and a profound connection to the theory of harmonic 
functions, although at a cost in lack of rigour that some were to feel was 
too high to pay. 


13.2 Complex Function Theory 


In this section, some of the more important results will be indicated. 
The more intuitive ones will be explained; others, which require a more 
careful explanation, are standard in any book on complex function 
theory, and were given various accounts in the nineteenth century. 


Fig. 13.1 Bernhard Riemann (1826-1866) c. 1863. Riemann, Gesammelte mathematische 
Werke, 3rd. edn. 1990 


Riemann had begun his [234] with an insight that it had taken
Cauchy years to get clear: a function $f$ of two real variables $x$ and $y$ that
is a complex-valued function of a single complex variable $z = x + iy$ is
best understood (and had always been informally understood) to be a
function on a domain in the complex plane on which it is complex
differentiable. That is, if and only if the limit of the quotient
\[ \frac{f(z + dz) - f(z)}{dz} \]
exists as $dz$ tends to zero and does not depend on the direction of $dz$;
or, to put that point another way, if it does not depend on the
components $dx$ and $dy$.
By letting $dz = dx$ and then letting $dz = i\,dy$ and equating the
results, we obtain the partial differential equation
\[ \frac{\partial f}{\partial y} = i\,\frac{\partial f}{\partial x}, \]
from which, on writing $f(x, y) = u(x, y) + iv(x, y)$, the familiar Cauchy-
Riemann equations are obtained:
\[ u_x = v_y, \qquad u_y = -v_x. \]


They form a system of two coupled first-order partial differential 
equations, and it is interesting to note that although Riemann was 
willing to base his new theory of complex analytic functions on them 
Weierstrass later was not—an indication of Weierstrass’s preference 
for power series, to be sure, but also an indication of how opinions 
stood on the subject of partial differential equations in the mid- to late 
nineteenth century.²
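The Cauchy-Riemann equations, $u_x = v_y$ and $u_y = -v_x$ for $f = u + iv$, can be tested numerically on any given function. The following is a sketch under stated assumptions: the helper names `partials` and `cauchy_riemann_residuals` are hypothetical, the partial derivatives are estimated by central differences with an arbitrarily chosen step, and the two test functions (the exponential and complex conjugation) are illustrative choices.

```python
import cmath


def partials(f, x, y, h=1e-5):
    """Central-difference estimates of (u_x, u_y, v_x, v_y) for f = u + iv at the point x + iy."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return fx.real, fy.real, fx.imag, fy.imag


def cauchy_riemann_residuals(f, x, y):
    """Return (u_x - v_y, u_y + v_x); both vanish where f is complex differentiable."""
    ux, uy, vx, vy = partials(f, x, y)
    return ux - vy, uy + vx


if __name__ == "__main__":
    # The exponential is complex differentiable: both residuals are close to zero.
    print(cauchy_riemann_residuals(cmath.exp, 0.3, -0.7))
    # Complex conjugation is not: the first residual is close to 2.
    print(cauchy_riemann_residuals(lambda z: z.conjugate(), 0.3, -0.7))
```

The contrast between the two outputs shows that the equations are a genuine restriction on a pair of real functions $u$ and $v$, and not something that holds automatically.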

Riemann appreciated, but Cauchy did not, that these equations
imply that at points where the derivative of $f$ does not vanish $f$ is
conformal (angle preserving), because the differential can then be
written as
\[ \begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix} = \begin{pmatrix} u_x & u_y \\ -u_y & u_x \end{pmatrix} = g(z) \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix} \]
for a suitable function $g(z)$ and a parameter $t$.
Riemann also noted that if you differentiate these equations you
obtain the equations
\[ u_{xx} + u_{yy} = 0, \qquad v_{xx} + v_{yy} = 0, \]
so $u$ and $v$, the real and imaginary parts of $f$, are harmonic functions. He
might well have learned of the conformal nature of complex analytic 
functions from Gauss, who had observed this fact in 1822, and he surely 
learned to appreciate harmonic functions from his mentor Dirichlet, so 
much so that much of Riemann’s theory of analytic functions is derived 
from a study of harmonic functions. 

It follows from the work of Cauchy that a function that is complex 
differentiable is infinitely differentiable and can be written as a power 
series in a neighbourhood of a point where it takes a finite value.³


Riemann knew this, but generally preferred to treat complex functions 
geometrically. 


13.3 The Riemann Mapping Theorem 


Riemann believed that the Dirichlet problem (see Sect. 18.3) had a
solution. In other words, given a simply connected region $T$ and a
continuous function defined on the boundary of $T$, the function has a
unique continuous extension to a harmonic function $u$ defined on the
interior of $T$. It followed that there was a unique complex differentiable
function $u + iv$ once the conjugate function was specified at a point.⁴


He gave a proof of this claim that was inadequate, because the 
theoretical understanding of the Dirichlet problem was too poor, and 
the claim divided mathematicians for a generation. Some found it 
almost certainly true and worth using until a rigorous proof came 
along, and at the other extreme some found it beyond hope. (It was 
eventually established under very general conditions.) 

Riemann used it as the basis of an argument for a much stronger 
and deeper result that came to be called the Riemann mapping theorem⁵:
Theorem 13.2 The Riemann mapping theorem: Any two simply 
connected regions are not only topologically equivalent but analytically 
equivalent. 


(Riemann assumed that such regions are bounded by curves 
homeomorphic to circles, which need not be the case, but topology was 
being pulled into existence by this and other papers of his.) His proof 
that any two such regions, with boundaries as described, are in fact 
topologically equivalent was already novel. That they are analytically 
equivalent means that there is essentially only one simply connected 
domain for the purposes of complex function theory, and that can be 
taken as the unit disc. 

This was immediately seen to be a powerful result, but Riemann’s 
argument for it (in Sect. 21) was naive, and attracted repeated attempts 
to prove it, some of which we shall meet below. 


Fig. 13.2 The initial stages of the mapping theorem 


His argument, somewhat over-simplified, went as follows (see
Fig. 13.2). Pick a point $z_0$ in the interior of T, and let $\Theta$ be a small disc
centre $z_0$ that lies entirely in T. Consider the function $\log(z - z_0)$ on $\Theta$
cut along a radius $\ell$, so that the branch of the log function jumps by
$-2\pi i$ as it crosses $\ell$ clockwise. Extend $\ell$ to a curve (also called $\ell$) that
does not cross itself and reaches to the boundary of T. Extend the
function to a continuous complex function f(z) on the whole of T that is
purely imaginary on the boundary of T, and likewise jumps by $-2\pi i$ as
it crosses $\ell$. Therefore, the imaginary part of this function goes from 0
to $2\pi$ as z completes a circuit of the boundary of T clockwise. Show
that the integral of f(z) around the boundary of $\Theta$ is zero, and over all
of T is finite. Deduce that it is possible to define what is called a Green’s
function (see Sect. 18.2): a function g(z) that is infinite at $z_0$ and has
zero real part on the boundary of T.


Fig. 13.3 An indication of the final stages of the mapping theorem 


On $\ell$ the value of the real part of $h(z) = f(z) + g(z)$ goes from $-\infty$
at $z_0$ to 0 where $\ell$ meets the boundary of T. Let $-\infty < a < 0$, and look
at $C_a = \{z \in T \mid \mathrm{Re}(h(z)) = a\}$ (see Fig. 13.3). Riemann claimed that this
can only be a single loop that does not cross itself, because T is simply
connected. So T is filled out by these loops, which are the level curves of
the real part of h(z), and they do not intersect each other. The function
$e^{h(z)}$ maps the boundary of T, which is the outermost of these loops,
onto the unit circle, and all the other loops onto concentric circles; the
point $z_0$ is mapped to the centre of the circle.


Therefore, T is analytically equivalent to the unit disc, and therefore 
any two simply connected regions are analytically equivalent. 


13.4 A Look Ahead 


Chapter 14 discusses how Riemann tackled the hypergeometric
equation. His analysis focussed on the fact that the coefficients of the
equation are infinite at precisely three points: $z = 0, 1, \infty$, and
therefore very likely the solutions, which will otherwise be finite, will
be infinite at those points. In fact, as Gauss had shown, the solutions
take the form of a power of the variable z times a convergent power
series, or a power of $1 - z$ times a convergent power series. So
Riemann, who knew Gauss’s papers [114, 115] very well, could see in
advance that the solutions were likely to be branched at $z = 0, 1, \infty$ and
would usually be what were called in the nineteenth century many-
valued functions.

Riemann could also see, as Kummer had done, that Gauss was
hinting that the transformations $z \mapsto 1/z$ and $z \mapsto 1 - z$ were
particularly relevant, and this is connected to what are called Möbius
transformations, which have the special property of mapping triangles
whose sides are arcs of circles to other circular-arc triangles in an
angle-preserving way.

It was also becoming clear to mathematicians by 1850 that although
a power series is convergent only inside some circular disc (which
might be the whole plane of complex numbers) the complex analytic
function it defines might be defined on a much larger region (think of
$1 + z + z^2 + \cdots$ and $(1 - z)^{-1}$). For Weierstrass, in particular, this
invited the question of how a complex function could be defined on a
family of overlapping discs, and this led to a theory of analytic
continuation.
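
The geometric series mentioned above makes the point concrete. The sketch below, with hypothetical helper names and sample points chosen only for illustration, compares partial sums of 1 + z + z^2 + ... with the closed form 1/(1 - z) at a point inside the unit disc and at a point outside it.

```python
def partial_sum(z, n):
    """Return 1 + z + z**2 + ... + z**n."""
    total, term = 0j, 1 + 0j
    for _ in range(n + 1):
        total += term
        term *= z
    return total

def closed_form(z):
    """The function 1/(1 - z), defined for every z except z = 1."""
    return 1 / (1 - z)

inside = 0.5 + 0.3j   # |z| < 1: the series converges to 1/(1 - z)
outside = -2 + 0j     # |z| > 1: the partial sums grow without bound

err_inside = abs(partial_sum(inside, 200) - closed_form(inside))
size_outside = abs(partial_sum(outside, 50))
print(err_inside, size_outside, closed_form(outside))
```

Inside the disc the two agree to rounding error; outside it the partial sums are enormous while the closed form is perfectly finite, so the function extends beyond the disc on which the series defines it.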


Finally (for now), it was discovered that if a (non-constant) complex
function is defined everywhere including $\infty$ then it would have to take
the value $\infty$ somewhere (this can be handled with elementary limiting
arguments). For example, the simplest functions on the complex plane
with a point at infinity are of the form

\[ z \mapsto \frac{az + b}{cz + d}, \]

where a, b, c, d are constants and $ad - bc \neq 0$ (the Möbius
transformations). They take the value $a/c$ when $z = \infty$ and the value
$\infty$ when $z = -d/c$. Discovery of this fact about complex functions is
disputed between Hermite and Liouville, to whom it more properly
belongs, and Cauchy.⁶ It was also known to Riemann, who may perhaps
have discovered it for himself.

For convenience, a brief account of each of these topics will be 
found in Appendix E. 

The results we need about Möbius transformations are simple, and
concern their effect on straight lines and circles. As shown in
Appendix F, a Möbius transformation maps a straight line either to a
straight line or a circle, and it also maps a circle to either a straight line
or a circle. Informally in this context mathematicians used to think of a
straight line as a circle passing through $\infty$. With that convention in
place, we can say that a Möbius transformation which maps three
given points to three other given points maps the circle
through the first triple of points to the circle through the second triple.
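
The circle-preserving property can be tested numerically using the standard fact that four points lie on a common circle or line exactly when their cross-ratio is real. The following is a minimal sketch; the coefficients, the sample points, and the function names are assumptions made for illustration only.

```python
import cmath

def mobius(a, b, c, d):
    """Return the map z -> (a*z + b)/(c*z + d); requires ad - bc != 0."""
    if a*d - b*c == 0:
        raise ValueError("ad - bc must be non-zero")
    return lambda z: (a*z + b) / (c*z + d)

def cross_ratio(z1, z2, z3, z4):
    """Cross-ratio of four distinct points; real iff they are concyclic."""
    return ((z1 - z3) * (z2 - z4)) / ((z1 - z4) * (z2 - z3))

T = mobius(2, 1j, 1, 3)   # hypothetical coefficients, ad - bc = 6 - i

# Four points on the unit circle and their images under T.
points = [cmath.exp(1j * t) for t in (0.3, 1.1, 2.5, 4.0)]
images = [T(z) for z in points]

cr_before = cross_ratio(*points)
cr_after = cross_ratio(*images)
far_value = T(1e12)       # should be close to a/c = 2
print(cr_before, cr_after, far_value)
```

The cross-ratio is real before the map and unchanged after it, so the four image points are again concyclic; the value at a very large argument approaches a/c, in line with the value taken at infinity.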


13.5 Exercises 

1. Describe the image of the upper half-plane under the two-valued
“map” $z \mapsto z^{1/2}$. Extend your account to describe the image of the
whole plane.

2. Find all the Möbius transformations mapping the upper half-plane
to itself.

3. Find a Möbius transformation mapping the upper half-plane to the
unit disc centred on the origin, and interpret it as an inversion.

4. Hence, find all the Möbius transformations mapping the disc to
itself.


Questions 


1. One of the great divides in mathematics is between a real-valued
function of a real variable being differentiable, and a complex- 
valued function of a complex variable being complex differentiable. 
Operationally, it is hidden behind the formalism, which looks like 
nothing more than switching x with z. For as long as it seemed 
natural to mathematicians to believe that functions were 
differentiable, even infinitely differentiable, this distinction was 
even harder to see. What signs have you seen of it already, whether 
appreciated or overlooked by the mathematicians you are 
considering? 


Footnotes 


1 See Bottazzini and Gray [22] for a comprehensive history. 


2 For more on this, see Bottazzini and Gray ([22], Chap. 6). The connection with Dirichlet’s 
principle is explored in Sect. 19.2. 


3 For the complicated history of this result, known today as Cauchy’s integral theorem, see 
Bottazzini and Gray [22]. 


4 See Riemann ([234], Sect. 19). 


5 Riemann only had in mind domains whose boundaries are topological circles; questions 
about boundaries were only investigated much later. 


6 See Lützen ([192], Chap. XIII).
14.1 Introduction


Kummer’s table of 24 solutions to the hypergeometric equation gives each as a
hypergeometric series in $x$, $1 - x$, $\frac{1}{x}$, $1 - \frac{1}{x}$, $\frac{1}{1-x}$, or $\frac{x}{x-1}$, possibly
multiplied by some powers of $x$ and/or $1 - x$. It presents each solution
in a form that restricts it to a certain domain, and Kummer then gave the
relationship between overlapping solutions. In his [235],
Riemann proposed to apply his new, geometric methods to study the
hypergeometric equation as an equation for functions of a complex
variable, noting correctly but dramatically that these methods were
essentially applicable to all linear differential equations with algebraic
coefficients.

Like Gauss, Riemann took the independent variable to be complex, 
and a possibly unexpected result of doing this is that what look like two 
independent complex solutions of the hypergeometric equation 
generally appear as two branches of the same many-valued function. 
Indeed, a typical way that branched or many-valued functions arise is in 
the solution of ordinary differential equations. 


The importance of Riemann’s work is both general and specific: 
general in that it opened up the theory of ordinary differential 
equations to complex functions of a complex variable, and specific in 
that it showed that equations like the hypergeometric equation can be 
recovered completely from a knowledge of the behaviour of their 
solutions under analytic continuation. 


14.1.1 Ordinary Differential Equations and Many-Valued 
Functions 


First, we recall some elementary facts about the solution of the
hypergeometric equation. We shall suppose that in the hypergeometric
equation

\[ x(1-x)\frac{d^2y}{dx^2} + \bigl(\gamma - (\alpha+\beta+1)x\bigr)\frac{dy}{dx} - \alpha\beta y = 0 \]

the variable x is complex. The standard method for solving this
equation, and any linear ordinary differential equation, in a
neighbourhood of the origin is to substitute a power series of the form

\[ x^{\lambda}(a_0 + a_1 x + a_2 x^2 + \cdots) \]

and to look at the coefficient of the lowest power of x. This will be an
equation for $\lambda$, and in this case the equation is

\[ \lambda(\lambda - 1 + \gamma) = 0, \]

so $\lambda = 0$ or $1 - \gamma$.

The recurrence relation for the coefficients then gives solutions in the
form

\[ y = F(\alpha, \beta, \gamma, x) \quad \text{and} \quad x^{1-\gamma}F(\alpha-\gamma+1, \beta-\gamma+1, 2-\gamma, x). \]

These solutions form a basis of solutions for the differential equation
in a neighbourhood of the origin. For use below let us call these
solutions $y_{01}$ and $y_{02}$.
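
The recurrence can be made explicit. The sketch below assumes the standard coefficient recurrence for the hypergeometric series, c_{n+1} = c_n (α+n)(β+n) / ((γ+n)(n+1)) with c_0 = 1, and sample real parameters chosen only for illustration. It sums the series and its first two derivatives term by term, then substitutes them into the differential equation in the standard form x(1-x)y'' + (γ - (α+β+1)x)y' - αβy = 0.

```python
def hyp_series(alpha, beta, gamma, x, terms=200):
    """Return (F, F', F'') for the series sum of c_n x**n, valid for |x| < 1."""
    c = 1.0
    F = dF = d2F = 0.0
    for n in range(terms):
        F += c * x**n
        if n >= 1:
            dF += n * c * x**(n - 1)
        if n >= 2:
            d2F += n * (n - 1) * c * x**(n - 2)
        # Step the coefficient from c_n to c_{n+1}.
        c *= (alpha + n) * (beta + n) / ((gamma + n) * (n + 1))
    return F, dF, d2F

def residual(alpha, beta, gamma, x):
    """Left-hand side of the hypergeometric equation evaluated on the series."""
    F, dF, d2F = hyp_series(alpha, beta, gamma, x)
    return x*(1 - x)*d2F + (gamma - (alpha + beta + 1)*x)*dF - alpha*beta*F

def indicial(lam, gamma):
    """Coefficient of the lowest power of x after substituting x**lam * (...)."""
    return lam * (lam - 1 + gamma)

res = residual(0.5, 1.25, 0.75, 0.3)
roots = (indicial(0.0, 0.75), indicial(1 - 0.75, 0.75))
print(res, roots)
```

The residual vanishes to rounding error, and both exponents 0 and 1 - γ annihilate the indicial polynomial, which is the numerical shadow of the two-solution basis near the origin.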


14.1.1.1 Result 1
We shall now show that, certain exceptional cases aside, the solutions
of the hypergeometric equation in the complex case are all branches of
the same complex function.

By what we said earlier, if the variable x goes on a small circle
around the point $x = 0$ the solution $y_{01}$ returns to its original value,
and the solution $y_{02}$ returns multiplied by $e^{2\pi i(1-\gamma)}$. This does not allow
us to infer that the two solutions are branches of the same function, so
we must look further afield. For convenience, and before we proceed,
we write this information as a matrix equation (writing $\tilde{y}$ for the value
of $y$ after the circuit):

\[ \begin{pmatrix} \tilde{y}_{01} \\ \tilde{y}_{02} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i(1-\gamma)} \end{pmatrix} \begin{pmatrix} y_{01} \\ y_{02} \end{pmatrix}. \tag{14.1} \]

For brevity, we write this as $\tilde{Y}_0 = D_0 Y_0$.

Had we looked for solutions in a neighbourhood of $x = 1$ the same
method would have produced solutions that are power series in $1 - x$
possibly multiplied by a power of $1 - x$, and indeed we would have
found that a basis of solutions in this neighbourhood is given by

\[ F(\alpha, \beta, \alpha+\beta-\gamma+1, 1-x) \quad \text{and} \quad (1-x)^{\gamma-\alpha-\beta}F(\gamma-\alpha, \gamma-\beta, \gamma-\alpha-\beta+1, 1-x). \]

We shall call these solutions $y_{11}$ and $y_{12}$.


The radius of convergence of all the four power series we have
written down is 1, and this means that in a neighbourhood of $x = 1/2$,
say, we have four solutions, and therefore two are linearly expressible
in terms of the other two. Let us say that

\[ y_{01} = a_{11}y_{11} + a_{12}y_{12}, \]
\[ y_{02} = a_{21}y_{11} + a_{22}y_{12}, \]

where the coefficients $a_{jk}$ are constants.
We can write this information in matrix form as

\[ \begin{pmatrix} y_{01} \\ y_{02} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} y_{11} \\ y_{12} \end{pmatrix}. \tag{14.2} \]

For brevity, we write this as $Y_0 = AY_1$.
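We now propose to look at what happens to the solutions $y_{01}$ and $y_{02}$
as x is taken on a small circle around the point $x = 1$. We can do this by
watching what happens as $y_{11}$ and $y_{12}$ undergo the same journey. We
know what they do individually, by analogy with Eq. (14.1):

\[ \begin{pmatrix} \tilde{y}_{11} \\ \tilde{y}_{12} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i(\gamma-\alpha-\beta)} \end{pmatrix} \begin{pmatrix} y_{11} \\ y_{12} \end{pmatrix}. \tag{14.3} \]

Again, for brevity, we write this as $\tilde{Y}_1 = D_1 Y_1$.

So conducting x around the point $x = 1$ returns $y_{01}$ and $y_{02}$ as

\[ \begin{pmatrix} \tilde{y}_{01} \\ \tilde{y}_{02} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i(\gamma-\alpha-\beta)} \end{pmatrix} \begin{pmatrix} y_{11} \\ y_{12} \end{pmatrix}, \tag{14.4} \]

which is $A D_1 Y_1$. But this expresses them in terms of $y_{11}$ and $y_{12}$. To express the answer
in terms of $y_{01}$ and $y_{02}$, we must use the inverse of the transformation
in Eq. (14.2). This inverse is given by the matrix

\[ A^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \]

and so the final result is that $\tilde{Y}_0 = A D_1 A^{-1} Y_0$.

Written out in full, this says

\[ \begin{pmatrix} \tilde{y}_{01} \\ \tilde{y}_{02} \end{pmatrix} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i(\gamma-\alpha-\beta)} \end{pmatrix} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \begin{pmatrix} y_{01} \\ y_{02} \end{pmatrix}. \tag{14.5} \]

To conclude that $y_{01}$ and $y_{02}$ are branches of the same function, it is
enough to show that the matrix

\[ \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i(\gamma-\alpha-\beta)} \end{pmatrix} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \]

is not diagonal, and a routine calculation shows that, for a matrix A with no
vanishing entries, it will be diagonal if and only if

\[ e^{2\pi i(\gamma-\alpha-\beta)} = 1, \]

that is, if and only if $\gamma - \alpha - \beta$ is an integer.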
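
The routine calculation can be checked numerically. The sketch below assumes that the basis near 0 is expressed through the basis near 1 by a connection matrix A, so that the circuit around 1 acts on the basis near 0 by the conjugate A D A^{-1}, where D = diag(1, e^{2πie}) and e stands for the local exponent difference. The particular matrix A and the function names are hypothetical, chosen only for illustration.

```python
import cmath

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(X):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    det = X[0][0]*X[1][1] - X[0][1]*X[1][0]
    return [[X[1][1]/det, -X[0][1]/det], [-X[1][0]/det, X[0][0]/det]]

def circuit_matrix(A, exponent):
    """A * diag(1, exp(2*pi*i*exponent)) * A^{-1}."""
    D = [[1, 0], [0, cmath.exp(2j * cmath.pi * exponent)]]
    return matmul(matmul(A, D), inverse(A))

def off_diagonal_size(M):
    return max(abs(M[0][1]), abs(M[1][0]))

A = [[1.0, 2.0], [0.5, -1.5]]   # hypothetical connection matrix, no zero entries

generic = off_diagonal_size(circuit_matrix(A, 0.3))     # non-integer exponent
integer_case = off_diagonal_size(circuit_matrix(A, 2))  # integer exponent
print(generic, integer_case)
```

With a non-integer exponent the off-diagonal entries are clearly non-zero, so the circuit mixes the two solutions; with an integer exponent they vanish to rounding error.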
A very similar calculation involving a small circuit around $x = \infty$
can be carried out. In this case, the basis of solutions valid near $x = \infty$
that we choose is

\[ y_{\infty 1} = x^{-\alpha}F(\alpha, \alpha-\gamma+1, \alpha-\beta+1, 1/x), \qquad y_{\infty 2} = x^{-\beta}F(\beta, \beta-\gamma+1, \beta-\alpha+1, 1/x). \]

Under analytic continuation around $x = \infty$ these solutions return as

\[ \tilde{y}_{\infty 1} = e^{-2\pi i\alpha}y_{\infty 1} \quad \text{and} \quad \tilde{y}_{\infty 2} = e^{-2\pi i\beta}y_{\infty 2}, \]

which we can write as

\[ \begin{pmatrix} \tilde{y}_{\infty 1} \\ \tilde{y}_{\infty 2} \end{pmatrix} = \begin{pmatrix} e^{-2\pi i\alpha} & 0 \\ 0 & e^{-2\pi i\beta} \end{pmatrix} \begin{pmatrix} y_{\infty 1} \\ y_{\infty 2} \end{pmatrix}. \tag{14.6} \]

So if we repeat the above calculation, but extending a basis of solutions
valid near $x = 0$ analytically around the point $x = \infty$, we find that the
corresponding matrix is diagonal, and the two solutions are not
branches of the same function, if and only if $\alpha - \beta$ is an integer.

Likewise, if we extend a basis of solutions valid near $x = 1$ around the
point $x = 0$ we find that
the members of the basis are branches of the same function unless
$\gamma$ is an integer.

We conclude that, these special cases aside, the members of a basis
of solutions to the hypergeometric equation turn out, under analytic
continuation, to be branches of the same function.


14.1.2 The Riemann Sphere 


One of Riemann’s simple but productive innovations was to work with 
the plane of complex numbers augmented by a point at infinity. This 
gave him a sphere, and he regarded the connection between the plane 
and the sphere as being given by stereographic projection. This is a map 
from the sphere to a tangent plane at a point S (which we shall call the 
South Pole) that is defined as follows. Let N (the North Pole) be the 
point on the sphere diametrically opposite to S, and let P be any point 
on the sphere other than N. The map from the sphere to the plane maps 
P to P′, the point where the line NP meets the plane (Fig. 14.1).


Fig. 14.1 Stereographic projection of the Riemann sphere onto the plane, Anschauliche 
Geometrie, 2nd edition, Springer, 1996 
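
The projection just described is easy to write down in coordinates. The sketch below assumes a sphere of radius 1 centred at the origin, with N = (0, 0, 1), S = (0, 0, -1), and the tangent plane at S given by the third coordinate equal to -1; these normalisations and the function names are choices made for illustration, not taken from Riemann.

```python
import math

def to_plane(p):
    """Project a point p of the unit sphere (p != N) from N = (0, 0, 1)
    onto the plane tangent at S = (0, 0, -1); returns plane coordinates."""
    x, y, z = p
    if math.isclose(z, 1.0):
        raise ValueError("the North Pole has no image in the plane")
    # The line N + t(P - N) meets the plane where 1 + t(z - 1) = -1.
    t = 2.0 / (1.0 - z)
    return (t * x, t * y)

def to_sphere(q):
    """Inverse map: send the plane point (X, Y) back to the unit sphere."""
    X, Y = q
    r2 = X*X + Y*Y
    return (4*X / (r2 + 4), 4*Y / (r2 + 4), (r2 - 4) / (r2 + 4))

south = to_plane((0.0, 0.0, -1.0))      # S goes to the point of tangency
equator = to_plane((1.0, 0.0, 0.0))     # the equator goes to a circle of radius 2
round_trip = to_sphere(to_plane((0.6, 0.0, 0.8)))
print(south, equator, round_trip)
```

Points close to N are sent far out in the plane, which is why adjoining the single point N to the image corresponds to adjoining a point at infinity to the plane.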


14.1.2.1 Result 2


We shall now see that analytically continuing a solution in a loop that 
winds once around all three branch points returns the original solution. 

Suppose that we have a function with exactly three branch points,
the points $z = a, b, c$ or, if you prefer, the points $z = 0, 1, \infty$. This
means that as the function is continued analytically around the point
$z = a$ it returns multiplied by a factor $e^{2\pi i\alpha}$, as it is continued
analytically around the point $z = b$ it returns multiplied by a factor
$e^{2\pi i\beta}$, and as it is continued analytically around the point $z = c$ it
returns multiplied by a factor $e^{2\pi i\gamma}$.

Suppose it is continued analytically around the point $z = a$ and then
around the point $z = b$. It now returns multiplied by a factor
$e^{2\pi i\alpha}e^{2\pi i\beta} = e^{2\pi i(\alpha+\beta)}$. By the deformation principle, you can think of this
path as consisting of a loop starting at a point P (other than one of the
branch points) that goes around the point $z = a$, returns to P, and then
goes around the point $z = b$ and returns to P, or as a loop that goes
around both a and b.

If it is continued analytically around the point $z = a$, then around
the point $z = b$, and then around the point $z = c$ it returns multiplied
by a factor

\[ e^{2\pi i\alpha}e^{2\pi i\beta}e^{2\pi i\gamma} = e^{2\pi i(\alpha+\beta+\gamma)}. \]

But now something more can be said. For simplicity, suppose that we
take a, b, and c to be points on the Riemann sphere.

The following argument is clearest if we suppose that the three
points are near the North Pole, and the path around them lies entirely
south of them. Now we can imagine that the path is gradually deformed
by moving it south until it lies arbitrarily close to the South Pole. It is
now clear that continuing the function along this path returns it
unchanged, and by the deformation principle stated above this means
that the function returned unchanged along its original path. Therefore,
$\alpha + \beta + \gamma$ must be an integer.
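
A simple function with branch points at 0, 1, and infinity illustrates the argument. The sketch below assumes the sample function z^a (z - 1)^b with a = 0.3 and b = 0.45, which is an illustration and not one of the hypergeometric solutions. It follows the function around closed loops by accumulating small increments of the logarithm, so that the factor acquired on return can be read off.

```python
import cmath

def continuation_factor(loop, branch_data):
    """Factor acquired by the product of (z - p)**e when z runs once round
    the closed polygonal loop. branch_data is a list of pairs (p, e)."""
    total = 0j
    n = len(loop)
    for j in range(n):
        z0, z1 = loop[j], loop[(j + 1) % n]
        for p, e in branch_data:
            # Increment of log(z - p) along one short step of the loop.
            total += e * cmath.log((z1 - p) / (z0 - p))
    return cmath.exp(total)

def circle(centre, radius, steps=2000):
    """Points of an anti-clockwise circle, used as a closed loop."""
    return [centre + radius * cmath.exp(2j * cmath.pi * k / steps)
            for k in range(steps)]

data = [(0, 0.3), (1, 0.45)]     # (branch point, exponent) pairs

around_0 = continuation_factor(circle(0, 0.25), data)
around_1 = continuation_factor(circle(1, 0.25), data)
around_both = continuation_factor(circle(0.5, 2.0), data)
print(around_0, around_1, around_both)
```

The loop around both finite branch points returns the product of the two individual factors. Read in the opposite direction it is a loop around infinity, so the three factors multiply to 1: here the exponent at infinity is -0.75, and the three exponents sum to the integer 0.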


14.2 Riemann’s P-Functions

Riemann began his [235] by specifying geometrically the functions he
intended to study. Any such function P is to satisfy the following three
properties:¹

1. It has three distinct branch points at a, b, and c, but each branch is
finite at all other points.²

2. A linear relation with constant coefficients exists between any
three branches $P'$, $P''$, $P'''$ of the function: $c'P' + c''P'' + c'''P''' = 0$.

3. There are constants $\alpha$ and $\alpha'$, called the exponents, associated
with the branch point a, such that P can be written as a linear
combination of two branches $P^{(\alpha)}$ and $P^{(\alpha')}$ near a, where $(z-a)^{-\alpha}P^{(\alpha)}$ and
$(z-a)^{-\alpha'}P^{(\alpha')}$ are single-valued, and neither zero nor infinite at a.
Similar conditions hold at b and c with constants $\beta$, $\beta'$ and $\gamma$, $\gamma'$,
respectively.

To eliminate troublesome special cases, Riemann further assumed
that none of $\alpha, \alpha', \beta, \beta', \gamma$ or $\gamma'$ are integers, and that the sum
$\alpha + \alpha' + \beta + \beta' + \gamma + \gamma' = 1$. He denoted such a function of z

\[ P\begin{Bmatrix} a & b & c & \\ \alpha & \beta & \gamma & z \\ \alpha' & \beta' & \gamma' & \end{Bmatrix} \quad \text{or something like} \quad P\begin{pmatrix} \alpha & \beta & \gamma \\ \alpha' & \beta' & \gamma' \end{pmatrix}(z) \]

when $(a, b, c) = (0, \infty, 1)$.


In terms of the singular points at a, b, and c the first and third
conditions say, for example, that P is branched like $(z - a)^{\alpha}$. The
second condition says that there are at most two linearly independent
determinations of the function under analytic continuation of the
various separate branches. One of the results Riemann established is
that, as with differential equations, this information specifies a P-
function up to a constant multiple.

The analogy between P-functions and hypergeometric functions
becomes clear if we take $a = 0$, $b = \infty$, and $c = 1$. There are two
linearly independent solutions of the hypergeometric equation at each
singular point; they are branched according to certain expressions in $\alpha$,
$\beta$, and $\gamma$; and any three solutions are linearly dependent.


Riemann showed that information of this kind about the solutions 
determines the differential equation completely, which goes some way 


to explain the great significance of the equation, as well as its special 
character. 

It will be enough to record the results that he achieved. A more 
detailed look at how Riemann arrived at them is given in the next 
section of this chapter. 

Result 3: If $P_1$ is another P-function branched at the same three
points a, b, c and with the same six exponents $\alpha, \alpha', \beta, \beta', \gamma, \gamma'$ then $P_1$
is a constant multiple of P.

Result 4: If P and $P_1$ are two P-functions with the same branch
points a, b, c and exponents the corresponding pairs of which differ by
integers, then $P_1$ is obtained from P by multiplying by powers of $z - a$,
$z - b$, and $z - c$, the precise powers being determined by the
differences in the exponents.

Result 5: A P-function satisfies a differential equation and, when
the branch points and the exponents of the P-function are suitably
chosen, the differential equation is precisely the hypergeometric
equation.

The conclusion is that, just like an algebraic function, a P-function is
specified up to a constant multiple by the information about its
branching.

The most immediate and important difference between Riemann’s 
approach and that of his predecessors is the relative lack of 
computation. As he remarked, his new method “allows results that 
were formerly obtained in part only after somewhat troublesome 
calculations to be derived almost immediately from the definition”. 
Rather than starting from a hypergeometric series he began with a P-
function having $\infty^2$ branches and three branch points. To be sure, any
two linearly independent branches have expansions as hypergeometric 
series, and any three branches are linearly dependent, but the 
argument employed by Riemann inverted that of Gauss and Kummer. 
His starting point was the set of solutions, functions which are shown 
to satisfy a certain type of equation. Their starting point was the 
equation from which a range of solutions are derived. Riemann showed 


that a very small amount of information, the six exponents at the 
branch points, entirely characterises the equation and defines the 
behaviour of the solutions. The hypergeometric equation is special in 
this respect, as Fuchs [110] was able to explain, and consequently the 
task of generalising the theory to cope with other differential equations 
was to be quite difficult. 


14.3 Riemann’s Arguments 


The analytic continuation of a P-function is determined by what
happens at the branch points, because any closed path can be written
as a product of loops around a, b, and c. When two linearly independent
branches $P'$ and $P''$, say, are continued analytically in a loop around
the branch point a in the positive (anti-clockwise) direction they return
as two other branches, $\tilde{P}'$ and $\tilde{P}''$, say. But then, by the second defining
property of a P-function,

\[ \tilde{P}' = a_1P' + a_2P'', \qquad \tilde{P}'' = a_3P' + a_4P'' \]

for some constants $a_1, a_2, a_3, a_4$. So the matrix

\[ A = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \]

describes what happens at the point a. This is very like what was described in
Result 1.

Let B and C be the matrices which describe the behaviour of $P'$ and
$P''$ under analytic continuation around b and c, respectively. Then, as
Riemann noted (in line with Result 2), a circuit of a and b can be
regarded as a circuit of c in the opposite direction, so

\[ CBA = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]

So, as Riemann remarked, “the coefficients of A, B and C completely
determine the periodicity of the function”.

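Now, for definiteness, Riemann supposed $a = 0$, $b = \infty$, $c = 1$, and
chose branches $P^{(\alpha)}$, $P^{(\alpha')}$, $P^{(\beta)}$, etc. as in (3). A circuit around a in the
positive direction returns $P^{(\alpha)}$ as $e^{2\pi i\alpha}P^{(\alpha)}$ and $P^{(\alpha')}$ as $e^{2\pi i\alpha'}P^{(\alpha')}$, so

\[ A = \begin{pmatrix} e^{2\pi i\alpha} & 0 \\ 0 & e^{2\pi i\alpha'} \end{pmatrix}. \]

To express the effect on $P^{(\alpha)}$ and $P^{(\alpha')}$ of a circuit around $b = \infty$, he
replaced them by their expressions in terms of $P^{(\beta)}$ and $P^{(\beta')}$,
conducted the new expressions around $\infty$, and then changed them
back into $P^{(\alpha)}$ and $P^{(\alpha')}$ by writing

\[ \begin{pmatrix} P^{(\alpha)} \\ P^{(\alpha')} \end{pmatrix} = B' \begin{pmatrix} P^{(\beta)} \\ P^{(\beta')} \end{pmatrix}. \]

Then

\[ B = B' \begin{pmatrix} e^{2\pi i\beta} & 0 \\ 0 & e^{2\pi i\beta'} \end{pmatrix} B'^{-1}, \]

with a similar expression for what happens near c.
Since $CBA = I$, it follows on taking determinants that

\[ \det(A)\det(B)\det(C) = 1 = e^{2\pi i(\alpha+\alpha'+\beta+\beta'+\gamma+\gamma')}, \tag{14.7} \]

which is why Riemann had assumed $\alpha + \alpha' + \beta + \beta' + \gamma + \gamma' = 1$.
The relation $CBA = I$ is an equation in $2 \times 2$ matrices, so it yields four
equations for the eight entries in the two matrices $B'$ and $C'$ in terms
of the six parameters of the P-function $\alpha, \alpha', \ldots, \gamma'$.

Riemann wrote the equations out explicitly, and showed that four of
the eight entries determine the other four. Indeed, he did better: in July
1856 he wrote down how to express the entries in the matrices $B'$ and
$C'$ in terms of the six exponents $\alpha, \alpha', \beta, \beta', \gamma, \gamma'$, but in the published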


paper he merely gave various ratios of these entries.


These were enough to prove the next result, Result 3:

Result 3
If $P_1$ is another P-function branched at the same three points a, b, c and
with the same six exponents $\alpha, \alpha', \beta, \beta', \gamma, \gamma'$ then $P_1$ is a constant
multiple of P.

This makes sense because, on the one hand, the branches of P and
$P_1$ near $z = a$ behave in the same way, and, on the other hand, the
analytic continuation of the branches of P and $P_1$ around the other
singular points is given by the same matrices, so they should behave in
exactly the same way everywhere.

More precisely, Riemann first showed that the ratio $P_1/P$ is
constant in a neighbourhood of $z = a$ and then because the exponents
are the same the analytic continuation of the branch $P_1$ is the same as
that of P and so $P_1$ is a constant multiple of P. This is where he made
tacit use of Liouville’s theorem (see Sect. E.4).

As Riemann then remarked, a very similar argument deals with two
P-functions with the same branch points a, b, c and exponents the
corresponding pairs of which differ by integers.⁴ Now, although the
analytic continuation is the same for the two functions, the quotient
$P_1/P$ is not constant in a neighbourhood of $z = a$, and instead
$(z - a)^{n}P_1/P$ is constant in that neighbourhood for a suitable integer
power n. This gave him Result 4:


Result 4
If P and $P_1$ are two P-functions with the same branch points a, b, c and
exponents the corresponding pairs of which differ by integers, then $P_1$
is obtained from P by multiplying by powers of $z - a$, $z - b$, and $z - c$,
the precise powers being determined by the differences in the
exponents.

Only now did Riemann satisfy himself that P-functions exist. He did 
this in Sect. 7, where he deduced the next result: 


Result 5
A P-function satisfies a differential equation and, when the branch points
and the exponents of the P-function are suitably chosen, the differential
equation is precisely the hypergeometric equation.

More precisely, $y = P$, $\frac{dy}{d\log z}$, and $\frac{d^2y}{d\log z^2}$ are three such P-
functions and, as Riemann showed, they satisfy a linear relationship
with coefficients that are certain rational functions in z. Explicitly, in the
case $\gamma = 0$, he found that P satisfies the hypergeometric equation in
this form⁵:

\[ (1 - z)\frac{d^2y}{d\log z^2} - (A + Bz)\frac{dy}{d\log z} + (A' - B'z)y = 0, \]

where $A = \alpha + \alpha'$, $A' = \alpha\alpha'$, $B = \beta + \beta'$, and $B' = \beta\beta'$.
Accordingly, Riemann connected his P-functions with the functions
$F(\alpha, \beta, \gamma, z)$ of Gauss:

\[ F(\alpha, \beta, \gamma, z) = \text{const} \cdot P^{(0)}\begin{pmatrix} 0 & \alpha & 0 \\ 1-\gamma & \beta & \gamma-\alpha-\beta \end{pmatrix}(z). \]


So Riemann had shown that the branching data of the P-function 
determined its monodromy relations, i.e. the group generated by the
matrices A, B, C.⁶ Furthermore, Riemann had established that the
hypergeometric equation is the only second-order linear equation 


whose solutions satisfy the geometric conditions of his three 
postulates. 


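Riemann concluded by illuminating the relationship between P and F,
its hypergeometric series representation. Since $\alpha$ and $\alpha'$ may be
interchanged,

\[ P\begin{pmatrix} \alpha & \beta & \gamma \\ \alpha' & \beta' & \gamma' \end{pmatrix}(z) = P\begin{pmatrix} \alpha' & \beta & \gamma \\ \alpha & \beta' & \gamma' \end{pmatrix}(z), \]

there are eight P-functions for each hypergeometric series in z, say

\[ P^{(\alpha)}\begin{pmatrix} \alpha & \beta & \gamma \\ \alpha' & \beta' & \gamma' \end{pmatrix}(z) = z^{\alpha}(1 - z)^{\gamma}F(\beta+\alpha+\gamma, \beta'+\alpha+\gamma, \alpha-\alpha'+1, z). \]

There are six choices of variable, so 48 representations of a function as
a P-function.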
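
The bookkeeping behind this representation can be checked with exact arithmetic. The sketch below assumes the standard local exponents of Gauss's F(a, b, c, z), namely 0 and 1 - c at 0, a and b at infinity, and 0 and c - a - b at 1; the sample exponents and the function names are chosen only for illustration. It confirms that z^α (1 - z)^γ F(β+α+γ, β'+α+γ, α-α'+1, z) has exactly the six prescribed exponents when they sum to 1.

```python
from fractions import Fraction as Fr

def gauss_parameters(al, al2, be, be2, ga, ga2):
    """Parameters (a, b, c) of F in z**al * (1 - z)**ga * F(a, b, c, z)."""
    if al + al2 + be + be2 + ga + ga2 != 1:
        raise ValueError("the six exponents must sum to 1")
    return (be + al + ga, be2 + al + ga, al - al2 + 1)

def exponents_of_product(al, ga, a, b, c):
    """Exponent pairs at 0, infinity and 1 of z**al * (1 - z)**ga * F(a, b, c, z)."""
    return {
        "0": (al, al + 1 - c),
        "inf": (a - al - ga, b - al - ga),
        "1": (ga, ga + c - a - b),
    }

# Sample exponents (alpha, alpha', beta, beta', gamma, gamma') summing to 1.
al, al2 = Fr(1, 3), Fr(-1, 6)
be, be2 = Fr(1, 4), Fr(1, 12)
ga, ga2 = Fr(1, 5), Fr(3, 10)

a, b, c = gauss_parameters(al, al2, be, be2, ga, ga2)
recovered = exponents_of_product(al, ga, a, b, c)
print(a, b, c, recovered)
```

The recovered pairs are (α, α'), (β, β'), and (γ, γ'), and the condition that the exponents sum to 1 is precisely what makes the second exponent at 1 come out as γ'.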


14.4 Exercises 

1. Look back at the table of Kummer’s 24 solutions to the
hypergeometric equation and determine their domains of
convergence as functions of a complex variable.

2. Find values of $\alpha$, $\beta$, and $\gamma$ so that these solutions are only two-, or
three-, or five-valued functions.


Questions 
1. The properties of Riemann’s P-functions are obviously derived


from the properties of solutions of the hypergeometric equation. In 
what sense are they the essential properties? 


Footnotes 


1 Riemann’s notation comes from his reading of Gauss’s still unpublished paper, which 
Riemann explained in his note (1857b) he had read in Gauss’s Nachlass. Gauss had died in 
1855. 


2 This corrects a rare slip in the English translation of Riemann’s work; Riemann said, in
somewhat obscure words, that the P-function is locally single-valued except at the points a, b, c.


3 See Riemann ([235], 3). 


4 These are Riemann’s versions of Gauss’s contiguous functions. 


5 The logarithmic derivative $\frac{d}{d\log x}$ satisfies $\frac{d}{d\log x} = x\frac{d}{dx}$ and $\frac{d^2}{d\log x^2} = x\frac{d}{dx} + x^2\frac{d^2}{dx^2}$, as you can
check by setting $u = \log x$ and using the fact that $\frac{d}{dx} = \frac{1}{x}\frac{d}{du}$.


6 Monodromy matrices were introduced by Hermite in response to Puiseux [229], a paper that 
had carefully examined the effect of analytic continuation on a branch of an algebraic function 
around one of its branch points with a view to elucidating the integration of an algebraic function
over a closed path containing a branch point; Cauchy reported on this work in Cauchy [38]. 
Riemann did not cite this work; as was the custom of the time, Riemann seldom gave references 
—except, in his case, to Gauss and Dirichlet—but he was well read all the same, and was 
undoubtedly one of those mathematicians who absorbed the work of others and then rederived 
it in his own way. The term monodromy group was first used by Jordan in his Traité ([154],
278) and its subsequent popularity derives from its successful use by Jordan and Klein in the 
1870s. 
15.1 Introduction


A second-order linear ordinary differential equation has a basis of two 
solutions, and it was to turn out that their quotient has interesting 
properties. Indeed, the set-theoretic inverse function has still more 
interesting properties that bring out a strong family resemblance 
between the regular or Platonic solids, plane lattices, and tessellations 
of the non-Euclidean disc. 


15.2 Quotients of Solutions 


First, we write the hypergeometric equation, or indeed an arbitrary
second-order linear ordinary differential equation, in the form

\[ f'' + pf' + qf = 0, \tag{15.1} \]

where p and q are functions of z.

To investigate the behaviour of two linearly independent solutions u
and v of (15.1), it turned out to be convenient to introduce their
quotient $w = u/v$. Then

\[ w' = \frac{u'v - uv'}{v^2}. \]

It is important to observe that

\[ w' = 0 \iff u'v - uv' = 0 \iff \frac{u'}{u} = \frac{v'}{v} \iff u = kv, \]

for some constant k. Because u and v are a basis of solutions, we infer
that $w'$ never vanishes and so w is locally one-to-one, away, that is, from
any singularities of u or v.
Now we restrict our attention to the hypergeometric equation, for
which

\[ p = \frac{\gamma - (\alpha+\beta+1)z}{z(1-z)}, \qquad q = \frac{-\alpha\beta}{z(1-z)}. \]

We are going to follow Schwarz and make some assumptions about $\alpha$,
$\beta$, and $\gamma$ as the argument proceeds. We start by assuming that they are
all real. This means that the solutions of the hypergeometric equation
take real values when z is restricted to real values.

In 1872, Schwarz investigated the question of when the solutions to
this differential equation are algebraic functions of z. That is to say,
they are functions w(z) such that there is a polynomial equation
$F(z, w) = 0$. (For example, in real variables, the equation
$x^2 + y^2 - 1 = 0$ defines y as an algebraic function of x.) This means
that for every value of z there is a finite number of values of w. We have
seen that the quotient can be made to take real values, so the images of
the segments $(-\infty, 0)$, $(0, 1)$, and $(1, \infty)$ can be real.


Schwarz knew from Gauss’s work and Riemann’s that the solutions
of the hypergeometric equation have three singular points, at $0, 1, \infty$,
and that the solutions are of the form $x^{a}f_0(x)$ near $x = 0$,
$(1 - x)^{a}f_1(1 - x)$ near $x = 1$, and $(1/x)^{a}f_\infty(1/x)$ near $x = \infty$, where
the exponent a is one of the numbers in Kummer’s table and the fs are
analytic functions that do not vanish at the point in question.

When a function is algebraic there are only finitely many values of w
for each value of z. The quotient of two solutions of the hypergeometric
equation, like the individual solutions, is usually many valued, because
it is of the form $z^{a}$ times a holomorphic function. For the quotient to be
algebraic, this immediately
implies that the exponents a must be rational numbers, and this
imposes conditions on the coefficients $\alpha, \beta, \gamma$ of the hypergeometric
equation. Schwarz found that on setting

\[ (1 - \gamma)^2 = \lambda^2, \qquad (\alpha - \beta)^2 = \mu^2, \qquad (\gamma - \alpha - \beta)^2 = \nu^2, \]

where $\lambda$, $\mu$, and $\nu$ are real and positive, the image of the upper half-
plane under a quotient of two solutions of the hypergeometric
equation has angles of $\lambda\pi$, $\mu\pi$, and $\nu\pi$. After imposing some further
restrictions on $\alpha$, $\beta$, and $\gamma$, he could even insist that $\lambda$, $\mu$, and $\nu$ be
reciprocals of integers. We shall now follow him part of the way.

Schwarz observed that two linearly independent solutions of a
hypergeometric equation that are each algebraic will have a quotient
that is algebraic, because it can only have finitely many values at each
point z.
The quotient is singular at the points 0, 1, $\infty$ where the individual
solutions are singular. Otherwise, because $\alpha$, $\beta$, and $\gamma$ are real, the
hypergeometric equation has solutions that are real-valued on each of
the segments $(-\infty, 0)$, $(0, 1)$, and $(1, \infty)$ of the real axis. The same is,
therefore, true of the quotient: it can take real values on the real axis.
But, because the quotient is many-valued, these segments have other
images.

Before we find what they are, let us look at the effect of the quotient
on a neighbourhood of the singular points. The upper half-plane can be
considered as a triangle with vertices at 0, 1, $\infty$, where the angle at
each vertex is $\pi$. We can now see that the quotient w maps these
angles to angles $\lambda\pi$, $\mu\pi$, and $\nu\pi$.
Now we need to check some things: 


• that each segment is mapped monotonically onto its image,

• that no two of the sides cross, and

• that the upper half-plane is mapped onto the interior of the
triangle.


These properties hold because we are dealing with a second-order 
ordinary differential equation, and so we know that w’ can only vanish 


at the singular points, and elsewhere is one to one. In particular, it is 
monotonic on the segments. 
To find the images of the segments $(-\infty, 0)$, $(0, 1)$, and $(1, \infty)$ in general, we
let a basis of solutions consist of two functions, say u(z) and v(z), and
look at the quotient w(z) = u(z)/v(z). Suppose the complex variable z is
taken on a loop and returns to its starting point. The functions u(z) and
v(z) return as new solutions, and therefore can be expressed as linear
combinations of u(z) and v(z), say $au(z) + bv(z)$ and
$cu(z) + dv(z)$, where a, b, c, and d are constants determined by the
loop. So the quotient returns as

$$\frac{aw(z) + b}{cw(z) + d}.$$

So w has been subject to a Möbius transformation.
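
The step from the linear change of basis to the Möbius transformation of the quotient can be checked numerically. The following is only an illustrative sketch: the values chosen for u, v and for the constants a, b, c, d are arbitrary assumptions, and the helper names are hypothetical rather than taken from any source.

```python
def continue_around_loop(u, v, a, b, c, d):
    """Return the pair of solutions after the loop: (a*u + b*v, c*u + d*v)."""
    return a * u + b * v, c * u + d * v


def mobius(w, a, b, c, d):
    """Apply the Mobius transformation w -> (a*w + b) / (c*w + d)."""
    return (a * w + b) / (c * w + d)


# Arbitrary sample values of the two solutions at the base point (assumed).
u, v = 2 + 1j, 1 - 3j
# Arbitrary constants standing in for the effect of one loop (assumed).
a, b, c, d = 1 + 2j, 0.5, -1j, 3

new_u, new_v = continue_around_loop(u, v, a, b, c, d)
quotient_before = u / v
quotient_after = new_u / new_v

# The new quotient agrees with the Mobius image of the old quotient.
difference = abs(quotient_after - mobius(quotient_before, a, b, c, d))
print(difference < 1e-12)
```

Dividing numerator and denominator of the new quotient by v is all that happens here; the code simply confirms the algebra on one example.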

Now we use the fact that a Möbius transformation sends a straight
line or a circle to a straight line or a circle (see Appendix F). So if the
image of one of these segments can be a straight line segment, and any
other image is the transform of that one by a Möbius transformation,
then the other images are either straight lines or circles.
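
As a hedged illustration of this fact, one can take a few points on the real axis, apply a Möbius transformation, and confirm that the images lie on a single circle. The coefficients and sample points below are arbitrary assumptions, and `circle_centre` is a hypothetical helper written for this sketch.

```python
def mobius(z, a, b, c, d):
    """Apply the Mobius transformation z -> (a*z + b) / (c*z + d)."""
    return (a * z + b) / (c * z + d)


def circle_centre(p, q, r):
    """Centre of the circle through three non-collinear complex points."""
    # Translate so that p is the origin, solve |m| = |m - q0| = |m - r0|.
    q0, r0 = q - p, r - p
    numerator = q0 * r0 * (q0.conjugate() - r0.conjugate())
    denominator = q0.conjugate() * r0 - q0 * r0.conjugate()
    return p + numerator / denominator


# Assumed coefficients; the pole z = -3i is off the real axis,
# so the real axis is sent to a circle rather than a straight line.
a, b, c, d = 2 + 1j, 1, 1, 3j
real_points = [-2.0, 0.0, 1.0, 5.0]
images = [mobius(complex(x), a, b, c, d) for x in real_points]

centre = circle_centre(*images[:3])
radii = [abs(w - centre) for w in images]

# All four images are at the same distance from the centre.
print(max(abs(r - radii[0]) for r in radii) < 1e-9)
```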

Consider now the segments $(-\infty, 0)$ and $(0, 1)$. They meet at an
angle of $\pi$ at $z = 0$, and locally the quotient is a complex function
multiplied by a power of z, say $z^{\lambda}$. So these two segments are mapped
to straight lines meeting at an angle of $\lambda\pi$. What happens at the other
two meeting points is this: the argument about the angle is exactly the
same, but the segment joining them is the image of the real axis not by
w but by some Möbius transformation of w, so it is a straight line or
circular arc. As a result, the image of the whole upper half-plane is a
straight-sided or circular-arc triangle with angles of $\lambda\pi$, $\mu\pi$, and $\nu\pi$.


Analytic continuation moves this region around by Möbius
transformations, and the result is a net of circular-arc triangles, which
are the images of the upper and lower half-planes by w as z is
conducted around the plane, avoiding the branch points at $z = 0, 1, \infty$.


It is one thing to know that the image of the upper (or lower) half- 
plane by the quotient w(z) is a circular-arc triangle, and another to 
know what happens as the variable z is led on one loop after another. 
We get a succession of images of the upper and lower half-planes and 
so a succession of circular-arc triangles. How do they fit together? Do 
they fit together like pieces of a jigsaw puzzle, or do they overlap any 
old how? 

The crucial consideration is the angles at the vertices. If one of the
angles is, say, $\pi/6$, then it is reasonable to suspect that 12 will fit
together at the vertex, each obtained from the one before by a turn
through $\pi/6$, and this is indeed what happens. If a vertical angle is $\pi/n$
for some integer n then 2n fit together at the vertex, each obtained from
the one before by a turn through $\pi/n$. So if all three vertical angles are
of the form $\pi/n$ for integers n then we expect that successive images
form a web of triangles. But if any vertical angle is not of that form then
complicated overlaps will occur, which is why Schwarz (and Poincaré
after him) excluded those cases.

Suppose, for example, that the quotient we are looking at maps the
upper half-plane to a triangle with angles $\pi/2$, $\pi/3$, and $\pi/6$. These
angles sum to $\pi$, so all three sides of the image will be straight. Suppose
for definiteness that

• the point $z = 0$ is mapped to the point A where the angle is $\pi/2$,

• the point $z = 1$ is mapped to the point B where the angle is $\pi/3$,

• the point $z = \infty$ is mapped to the point C where the angle is $\pi/6$.


Suppose that z goes on a path starting from a point $z_0$ in the
upper half-plane and exiting the upper half-plane between $z = 0$ and $z = 1$.
Then the image of the lower half-plane will be a triangle ABC′ congruent
to triangle ABC and attached along the edge AB.
Let z re-enter the upper half-plane between $z = 1$ and $z = \infty$. The
new image of the upper half-plane will be another triangle BC′A′
congruent to ABC and attached to the previous one along the edge BC′.


The process continues in the fashion shown in Fig. 15.1.


Fig. 15.1 In every triangle the angles are either $\pi/2$ or $\pi/3$ or $\pi/6$, and cumulatively they
cover the plane


Each light triangle is the image of the upper half-plane, and each 
dark triangle the image of the lower half-plane. 

The figure may equally well be regarded as a web of equilateral
triangles (each containing three light and three dark triangles, such as
ABD and BCD), corresponding to a different hypergeometric equation
in which the angles at the vertices are all $\pi/3$.


It can also be seen as a web of parallelograms (made by joining two 
equilateral triangles together, such as ABCD). Now we have a figure with 
four vertices, so we are no longer dealing with the hypergeometric 
equation but with a different ordinary differential equation. 

In each of these cases, a point $z_0$ in the upper half-plane has an image in the first,
third, fifth, ..., triangles, and a point in the lower half-plane has an image in the
second, fourth, sixth, ..., triangles. We get some sort of a map from the
complex plane to the web of triangles we have constructed, but it is not
easy to say what is the image of $z_0$ because it seems to have infinitely


many images. Suppose for a moment that they have been marked 
P = Zx.q = Zy, each of them in the same position in the appropriate 


triangle. 

What is much clearer is the set-theoretic inverse map, from the web
of triangles or parallelograms back to the complex plane. This is
reminiscent of the arcsin function. There are infinitely many angles
whose sine is $\frac{1}{2}$, but the sine function treats them alike:

$$\sin(\pi/6) = \sin(\pi/6 + 2k\pi) = \tfrac{1}{2} = \sin(5\pi/6 + 2k\pi) \ldots$$

for any integer k.
For definiteness, let us return to the example where the upper and
lower half-planes are each mapped to triangles with angles $\pi/2$, $\pi/3$,
and $\pi/6$. In each light triangle is an image of the point $z_0$. The function that is the
set-theoretic inverse of w in this case maps them all to the point $z_0$.
Similarly, w has mapped any given point $\zeta$ in the upper half-plane to a
point in each light triangle, and the inverse function maps all those
points back to $\zeta$.


In the parallelogram case, the simplest thing that can happen is that
every point $\zeta$ in the upper half-plane has two images in each
parallelogram.¹ The inverse function is now one that takes every value
twice (this is again reminiscent of the sine function). But what is more
interesting is that the points where it takes a given value form into two
families. In each family, the points form a lattice; they are separated by
integer multiples of the lengths AB and AD. If we represent AB by the
complex number $\omega_1$ and AD by the complex number $\omega_2$, then it follows
that the inverse function, let us call it F, satisfies these equations:

$$F(z) = F(z + \omega_1) = F(z + \omega_2)$$

for every z. By analogy with the sine function, the function F is said to
be doubly periodic. It is also an elliptic function.²

Now, Schwarz’s problem was to arrange for there to be only finitely
many image triangles, so the condition he was led to discover was that
$\lambda + \mu + \nu > 1$. The only triangles that meet this condition are the so-
called digons (triangles with angles $\pi/2$, $\pi/2$, $\pi/n$ that fit together like
2n segments of an orange bisected at the equator) and the triangles
that fit together to form the faces of one of the regular solids (so the
triangles are a decomposition of a regular solid into congruent
triangles).

If we set the digons aside, the only examples of congruent triangles
with an angle sum greater than $\pi$ and angles of the form $2\pi/n$ for some
integer n (the condition that n of them meet at each vertex) are these:

• angles of $2\pi/3$, $2\pi/3$, and $2\pi/3$.
• angles of $\pi/2$, $\pi/2$, and $\pi/2$.
• angles of $2\pi/5$, $2\pi/5$, and $2\pi/5$.


These form the faces of the regular tetrahedron, octahedron, and
icosahedron (see Fig. 15.2), respectively, as they appear on the sphere.


Fig. 15.2 In every triangle the angles are all $2\pi/5$, so the triangles are all congruent, in Nash
([203], 101), copyright Elsevier (2014)


Schwarz also noticed that when $\lambda + \mu + \nu = 1$ the triangles are
Euclidean and fit together to cover the plane. And he gave one example
when $\lambda + \mu + \nu < 1$: $\lambda = \frac{1}{5}$, $\mu = \frac{1}{4}$, $\nu = \frac{1}{2}$ (Fig. 15.3), but he missed its
true significance, and was to become very cross with Poincaré when the
latter pointed it out. Poincaré’s entirely reasonable opinion, expressed to a
mutual acquaintance, was that missing this discovery was Schwarz’s
fault, and there was nothing that could be done about it.
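
The three situations Schwarz met (finitely many triangles on the sphere, a Euclidean net covering the plane, and a net filling a disc) can be told apart by comparing the angle sum with $\pi$. The sketch below assumes a triangle with angles $\pi/l$, $\pi/m$, $\pi/n$ for positive integers l, m, n; the function name `classify` is hypothetical and exact rational arithmetic is used so that the borderline Euclidean case is detected reliably.

```python
from fractions import Fraction


def classify(l, m, n):
    """Classify a triangle with angles pi/l, pi/m, pi/n by its angle sum."""
    # The angle sum is pi * (1/l + 1/m + 1/n); compare the bracket with 1.
    total = Fraction(1, l) + Fraction(1, m) + Fraction(1, n)
    if total > 1:
        return "spherical"   # angle sum exceeds pi: finitely many triangles
    if total == 1:
        return "Euclidean"   # angle sum equals pi: the net covers the plane
    return "hyperbolic"      # angle sum below pi: the net fills a disc


print(classify(2, 3, 6))  # the triangle with angles pi/2, pi/3, pi/6
print(classify(2, 3, 5))  # a triangle arising from the icosahedron
print(classify(2, 4, 5))  # a triangle with angles pi/2, pi/4, pi/5
```

The three calls return "Euclidean", "spherical", and "hyperbolic" respectively.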


Fig. 15.3 The triangles with angles $\pi/5$, $\pi/4$, $\pi/2$ cover a disc, Schwarz Gesammelte Mathematische
Abhandlungen, vol. 2, 240


15.3 Exercises


1. What nets of congruent triangles can you draw on the surface of a
sphere? What are the angles at each vertex?

2. What nets of congruent triangles can you draw on the plane? What
are the angles at each vertex?

3. What do you make of Schwarz’s net in Fig. 15.3? Look on the web
for other figures of this kind.


Questions
1. We have seen that the map $\zeta(z) = z^{1/2}$, which maps the complex z-
plane to the complex $\zeta$-plane, is not too hard to understand, but
even this simple case can be confusing. The set-theoretic inverse
map $z(\zeta) = \zeta^2$ is a much simpler 2 : 1 map. If you accept that a
quotient of solutions to the hypergeometric equation maps the
upper half-plane to a triangle, what does its set-theoretic inverse
do?

2. What properties does the set-theoretic inverse map have when
considered as a map of the whole net of triangles?


Footnotes 


1 This is a consequence of the Cauchy integral theorem in complex function theory. 


2 The efforts to identify doubly periodic and elliptic functions involved many mathematicians
from Gauss to Riemann and Weierstrass. The connection between elliptic integrals (integrals
of the form $\int \frac{dt}{\sqrt{f(t)}}$ where f(t) is a quartic in t) and elliptic functions was one of the great
discoveries of Abel and Jacobi in the 1820s. See, for example, Bottazzini and Gray [22] and
more briefly Gray [127].
16.1 Introduction


Poincaré (Fig. 16.1) took up the hypergeometric equation in order to
enter a prize competition, but it led him to his discovery of what he
called Fuchsian functions and the role of non-Euclidean geometry in
complex analysis, and it made his name as a mathematician of the first
rank.¹


16.2 Poincaré and Linear Ordinary Differential Equations


Henri Poincaré was born at Nancy on 29 April 1854, the son of a
professor of medicine at the university there. Apparently, he had a
happy childhood, and his mother, a very active and intelligent woman,
consistently encouraged him intellectually. His brilliance at
mathematics became apparent in the final years at school and he
entered the École Polytechnique at the top of his class, despite a poor
performance in drawing. He had a lifelong capacity to immerse himself
completely in abstract thought; it was said of him that he thought all the
time. Although he was extremely prolific, he seldom bothered to resort
to pen and paper; he disliked taking notes and gave the impression of
taking ideas in directly, and of having a perfect memory for details of all
kinds. When asked to solve a problem he could reply, it was said, with
the swiftness of an arrow.
Fig. 16.1 Jules Henri Poincaré (1854-1912)


In 1875, he graduated from the École Polytechnique, only second
because of another low mark in drawing, and proceeded to the École
des Mines. When he graduated from there he became a mining
inspector, and had to write a report on a mining disaster in which 22 
people were killed. In 1878, he presented his doctoral thesis to the 
faculty of Paris on the subject of partial differential equations. Darboux, 
one of the examiners and an early supporter of Poincaré, said that the 
thesis contained enough ideas for several good theses, although some 
points in it still needed to be corrected or made precise; Poincaré never 
did this. By now he had decided on a career as a mathematician, and by 
December 1879 he was in charge of the analysis course at the Faculty of 
Sciences at Caen. 

That year the prize competition of the Académie des Sciences called
for a contribution to the theory of differential equations, and on March
22 Poincaré submitted his essay ‘Mémoire sur les courbes définies par
une équation différentielle’. In it he considered first-order non-linear
differential equations

$$\frac{dx}{X(x, y)} = \frac{dy}{Y(x, y)},$$

where X and Y are real polynomial functions of real variables x and y,
and investigated the global properties of their solutions. But he
withdrew the essay on 14 June 1880, and the examiners never reported
on it.

Instead, he turned his attention to some work by the German 
mathematician Lazarus Fuchs, and submitted an essay on a topic 
connected with it on 28 May 1880. Fuchs was a former student of 
Kummer’s who had then taken up Weierstrass’s complex function 
theory, and was particularly interested in linear differential equations 
in the complex domain. In his major papers (1865) and (1866), he had 
shown how to generalise Riemann’s insights into the hypergeometric 
equation to linear ordinary differential equations of any order, and he 
had successfully characterised those equations all of whose solutions 
are holomorphic everywhere except for a finite number of points where 
the coefficients of the differential equation become infinite. 

In 1880, Fuchs had returned to the topic with a new question, and
as part of this work he had considered a second-order linear
differential equation with a basis of solutions $f_1(z)$ and $f_2(z)$ and
investigated what happens to their quotient $\zeta(z) = \frac{f_1(z)}{f_2(z)}$ under analytic
continuation. It is clear that when z is taken on a loop that encloses a
singular point, the quotient returns in the form

$$\frac{a_{11} f_1(z) + a_{12} f_2(z)}{a_{21} f_1(z) + a_{22} f_2(z)},$$

and so $\zeta$ is a many-valued function of z.


Poincaré read Fuchs’s work, but was not persuaded by it. He
entered into a correspondence with Fuchs that is too technical to
describe here.² But it is clear that while Fuchs was deeply immersed in
Weierstrass’s complex function theory, with its insistence on power
series methods, which are essentially local in scope, Poincaré had
immediately picked up on the global nature of the solutions to
differential equations.

Poincaré looked at the simplest cases, among them the
hypergeometric equation, which has three singular points. Suppose one
of the singular points is at the origin, and a basis of solutions is given by
the functions $z^{\rho_1}\phi_1(z)$ and $z^{\rho_2}\phi_2(z)$, where $\phi_1$ and $\phi_2$ are holomorphic
and non-zero in a neighbourhood of the origin; then their quotient is
$z^{\rho_1 - \rho_2}\phi_1(z)/\phi_2(z)$. The factor $\phi_1(z)/\phi_2(z)$ is holomorphic and non-zero
near the origin, so the behaviour of the quotient near the origin is
governed by the exponent difference, $\rho_1 - \rho_2$.


Poincaré wrote to Fuchs to say that if the exponent differences were
$\rho_1$, $\rho_2$, and (at infinity) $\rho_3$, then either $\rho_1 + \rho_2 + \rho_3 > 1$, in which case z
is rational in $\zeta$, or $\rho_1 + \rho_2 + \rho_3 = 1$, in which case z was doubly
periodic. Even in this case there were difficulties, and Poincaré supplied
an example to show that Fuchs’s theorem was still wrong. But, if a
requirement that Fuchs had imposed on the differential equations he
was considering was dropped, then the case $\rho_1 + \rho_2 + \rho_3 < 1$ could be
included, which, Poincaré remarked, gives a “much greater class of
equations than you have studied, but to which your conclusions apply.
Unhappily my objection requires a more profound study, in that I can
only treat two singular points”. However, z is still single-valued, and


These functions I call Fuchsian, they solve differential equations
with two singular points whenever $\rho_1$, $\rho_2$, and $\rho_3$ are
commensurable with each other. Fuchsian functions are very
like elliptic functions, they are defined in a certain circle and are
meromorphic inside it.


On the other hand, he concluded, he knew nothing about what 
happened when there were more than two singular points. 

When Poincaré wrote to him again on 30 July, his own
researches on the new functions had led him to discover that they


present the greatest analogy with elliptic functions, and can be 
represented as the quotient of two infinite series in infinitely 
many ways. Amongst those series are those which are entire 
series [which] converge in a certain circle and do not exist 
outside it, as thus does the Fuchsian function itself. 


and he went on to explain how the new functions were the solutions of 
an extensive class of differential equations. 

Before we turn to discuss what Fuchsian functions are, note that 
they only illuminate the study of a differential equation if they can be 
defined independently of the equation. This Poincaré did by 
introducing Fuchsian and theta-Fuchsian series. Moreover, by calling 
these functions Fuchsian and not Schwarzian Poincaré was showing 
that he had not read Schwarz’s paper. He was to be much criticised by 
Klein for this, but he refused to back down, and he only ended the 
quarrel by naming a related but new class of functions Kleinian, even 
though Klein had not had much to say about them. 

In the essay Poincaré submitted to the competition, he considered
when the quotient $\zeta = \zeta(z)$ of two independent solutions of the
differential equation $\frac{d^2 y}{dz^2} + Qy = 0$ defines, by inversion, a
meromorphic function z of $\zeta$ (note that y, z, and $\zeta$ are all complex
variables). He showed that Fuchs’s conditions were not necessary and
sufficient. Rather, for z to be meromorphic on some domain it was
necessary and sufficient that the exponents at each singular
point, including infinity, differ by an aliquot part of unity (i.e.
$\rho_1 - \rho_2 = \frac{1}{n}$, for some positive integer n). If the domain is to be the
whole complex sphere then this condition is still necessary, but it is no
longer sufficient. He found that there were too many special cases for
Fuchs’s methods to work easily, and so he proposed to take a new
approach, beginning with Fuchs’s example of a differential equation in
which there are two finite singular points $a_1$ and $a_2$, where the
exponent differences are 1/3 and 1/6, and the exponent difference at
$\infty$ is 1/2.


In this case, he found the change in z was of the form 

Z-a Z-a 

zh Z’, = eot/3 
a= 6 ia 2 

under analytic continuation around qj, and 

c= = 

ZR Zz’, eae gr (=) 
=o Z=0 


under analytic continuation around a,,and ! +, —! around o. Note 
ra ra 


that the first two of these maps keep the points a and n fixed and are 


otherwise like rotations.* 


Accordingly, z is a meromorphic single-valued function of $\zeta$
mapping a parallelogram composed of eight equilateral triangles onto
the complex sphere, and $\zeta = \infty$ is its only singular point, so z is an
elliptic function. The differential equation, Poincaré showed, has in fact
an algebraic solution 


yy, = (x-a)' (x - a)” 


and a non-algebraic solution v; such that 


ae fo 0)" a) ae 
y1 


This result agrees with Fuchs’s theory. 
However, it might be that the domain of $\zeta$ could not be the whole $\zeta$
sphere. Poincaré showed that this could happen even when the
differential equation had only two finite singular points. For example, if
the exponent differences were 2, 2, and 2 at oo, then as long as z 

Ig j 


crosses no cuts ¢ stays within a quadrilateral aga’y (see his Fig. 2, p. 


86, given as Fig. 16.2). 


Fig. 16.2 A quadrilateral in the unit disc (Poincaré, Oeuvres 1, 365) 


Furthermore, however z is conducted about in its plane, $\zeta$ cannot
escape the circle HH′. Poincaré described the quadrilateral as
“mixtiligne”; the circular-arc sides meet the circle HH′ at right angles.
This geometric picture is quite general: curvilinear polygons are
obtained with non-re-entrant angles and circular-arc sides orthogonal
to the boundary circle. Thus, the domain of z is the interior of the circle HH′, and
Poincaré then investigated whether z is meromorphic. This reduces to
showing that, as $\zeta$ is continued analytically, the polygons do not
overlap. This does not occur if the angles satisfy conditions derived
from Fuchs’s theory, unless the overlap is in the form of an annular
region.

However, if the angles are not re-entrant, this cannot happen, and so
z is meromorphic. Poincaré’s proof of this is of incidental interest. He
projected the circle HH′ stereographically onto the southern
hemisphere of a sphere, and then projected the image orthogonally
back onto its original plane. The circular arcs orthogonal to HH′
become straight lines, which renders the theorem trivial. This result
virtually concluded Poincaré’s essay. As he said in his letter to Fuchs,
his understanding was limited essentially to the case of two finite
singular points.
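
The straightening effect of this double projection can be illustrated numerically. The sketch below makes several assumptions that are not in the text: the circle HH′ is normalised to the unit circle, the sphere is the unit sphere with stereographic projection taken from the north pole, and the composite of the two projections is then $z \mapsto 2z/(1 + |z|^2)$. The centre chosen for the orthogonal circle and the name `straighten` are purely illustrative.

```python
import cmath
import math


def straighten(z):
    """Stereographic projection to the southern hemisphere of the unit
    sphere, followed by orthogonal projection back to the plane."""
    return 2 * z / (1 + abs(z) ** 2)


# A circle meets the unit circle at right angles exactly when
# radius**2 == |centre|**2 - 1 (assumed sample centre below).
centre = 1.5 + 0.5j
radius = math.sqrt(abs(centre) ** 2 - 1)

# Sample the circle and keep the part lying inside the unit disc.
samples = [centre + radius * cmath.exp(1j * math.radians(k)) for k in range(360)]
inside = [z for z in samples if abs(z) < 1]

# Every image k satisfies Re(k * conj(centre)) = 1, the equation of a
# straight line, so the arc has become a chord of the disc.
images = [straighten(z) for z in inside]
deviations = [abs((k * centre.conjugate()).real - 1) for k in images]
print(len(inside) > 0 and max(deviations) < 1e-9)
```

The check works because points on the orthogonal circle satisfy $1 + |z|^2 = 2\,\mathrm{Re}(z\bar{c})$, where c is the centre, which turns the image condition into a linear one.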

Poincaré also wrote three anonymous supplements to the essay
bearing the motto “Non inultus premor” that were received by the
Académie on 28 June, 6 September, and 20 December.⁵ We shall look at
them shortly for the surprising connection they make between the
theory of linear differential equations and non-Euclidean geometry.

In due course, Poincaré’s essay was awarded second prize.⁶
Hermite, one of the judges, said of the essay that⁷:


the author successively treated two entirely different questions,
of which he made a profound study with a talent by which the
commission was greatly struck. The second ... concerns the
beautiful and important researches of M. Fuchs, .... The results
... presented some lacunas in certain cases that the author has
recognized and drawn attention to in thus completing an
extremely interesting analytic theory. This theory has suggested
to him the origin of transcendents, including in particular elliptic
functions, and which has permitted him to obtain the solutions
to linear equations of the second order in some very general
cases. A fertile path is there that the author has not entirely gone
down, but which manifests an inventive and profound spirit. The
commission can only urge him to follow up his researches in
drawing to the attention of the Academy the excellent talent of
which they give proof.


16.3 Poincaré’s Breakthrough and Non-Euclidean Geometry


Poincaré himself has left us one of the most justly celebrated accounts
of the process of mathematical discovery, which concerns exactly his
route to the theory of Fuchsian functions.⁸ Poincaré gave this account
in a lecture to the Société de Psychologie in Paris in 1908, and it
was later published as the third essay in his volume Science et Méthode
[221], with the title “L’invention mathématique” (Mathematical
discovery).


Poincaré began by doubting that Fuchsian functions could exist, but
shortly came to the opposite view. He tells us in the lecture ([221], p.
50) that:


For two weeks I tried to prove that no function could exist
analogous to those I have since called the Fuchsian functions: I
was then totally ignorant. Every day I sat down at my desk and
spent an hour or two there: I tried a great number of
combinations and never arrived at any result. One evening I took
a cup of coffee, contrary to my habit; I could not get to sleep, the
ideas surged up in a crowd, I felt them bump against one
another, until two of them hooked onto one another, as one
might say, to form a stable combination. In the morning I had
established the existence of a class of Fuchsian functions, those
which are derived from the hypergeometric series. I had only to
write up the results, which just took me a few hours.


Then he had to find an independent description of the new functions, 
so that it is a meaningful remark that they solve certain differential 
equations. He went on ([221], 51): 


I then wanted to represent the functions as a quotient of two 
series; this idea was perfectly conscious and deliberate; the 
analogy with elliptic functions guided me. I asked myself what 
must be the property of these series, if they exist, and came 
without difficulty to construct the series that I called theta- 
fuchsian. 


Next, he said that ([221], 51-52) 


At that moment I left Caen where I then lived, to take part in a
geological expedition organized by the École des Mines. The
circumstances of the journey made me forget my mathematical
work; arrived at Coutances we boarded an omnibus for I don’t
know what journey. At the moment when I put my foot on the
step the idea came to me, without anything in my previous
thoughts having prepared me for it; that the transformations I
had made use of to define the Fuchsian functions were identical
with those of non-Euclidean geometry. I did not verify this, I did
not have the time for it, since scarcely had I sat down in the bus
than I resumed the conversation already begun, but I was
entirely certain at once. On returning to Caen I verified the result
at leisure to salve my conscience.


The supplements go beyond the essay precisely in their use of non- 
Euclidean geometry. In the first supplement, Poincaré began by 
reviewing the tessellation of the disc by “mixtiligne” quadrilaterals 
obtained by successively operating on one, which he called Q, by 
transformations M and N. He observed (p. 9) that these transformations 
form a group, and remarked: 


There are close connections with the above considerations and 
the non-Euclidean geometry of Lobachevskii. In fact, what is a 
geometry? It is the study of a group of operations formed by the 
displacements one can apply to a figure without deforming it. In 
Euclidean geometry the group reduces to rotations and 
translations. In the pseudogeometry of Lobachevskii it is more 
complicated. Indeed, the group of operations formed by means 
of M and N is isomorphic (‘isomorphe’) to a group contained in 
the pseudogeometric group. To study the group formed by 
means of M and N is therefore to do the geometry of 
Lobachevskii. Pseudogeometry will consequently provide us 
with a convenient language for expressing what we will have to
say about this group. (Poincaré’s emphasis.)


Poincaré’s realisation on boarding the bus at Coutances can be 
described very simply. He realised that the straightened version of the 
“mixtiligne” figures described at the end of his Prize essay was identical 
with the figures in Beltrami’s description of non-Euclidean geometry [5, 
6]; that, therefore, the original figures were conformally accurate 
representations of non-Euclidean figures; and finally that this meant 
the transformations formed from M and N were non-Euclidean 
isometries. Beltrami’s detailed discussion of the non-Euclidean 
differential geometry of the disc enabled Poincaré to give a new 


meaning to his previously analytical transformations. Consequently, on 
p. 20, he remarked that: 


The Fuchsian functions are to the geometry of Lobachevskii 
what the doubly periodic functions are to that of Euclid. 


Poincaré remained stuck on the case of the hypergeometric equation at 
least until his fourth letter, 30 July. Liberation came from an unexpected 
source, arithmetic ([221], 52, 53): 


I then undertook to study some arithmetical questions without
any great result appearing and without expecting that this could
have the least connection with my previous researches.
Disgusted with my lack of success, I went to spend some days at
the sea-side and thought of quite different things. One day,
walking along the cliff, the idea came to me, always with the
same characteristics of brevity, suddenness, and immediate
certainty, that the arithmetical transformations of ternary
indefinite quadratic forms were identical with those of non-
Euclidean geometry.

Once back at Caen I reflected on this result and drew 
consequences from it; the example of quadratic forms showed 
me that there were Fuchsian groups other than those which 
correspond to the hypergeometric series; I saw that I could 
apply them to the theory of theta-Fuchsian series, and that, as a 
consequence, there were Fuchsian functions other than those, 
which derived from the hypergeometric series, the only ones I 
knew at that time. I naturally proposed to construct all these 
functions; I laid siege systematically and carried off one after 
another all the works begun; there was one however, which still 
held out and as the chase became involved it took pride of place. 
But all my efforts only served to make me know the difficulty 
better, which was already something. All this work was quite 
conscious. 


The second supplement is given over to a more rigorous description of
non-Euclidean geometry, and to tessellations of the disc by polygons
with angles $\pi/m$ for integers m. When the polygon is a triangle, he also
discussed more carefully the ways of constructing Fuchsian functions in
this case and was led to conjecture a result which he said he was not yet
in any state to prove—the Riemann mapping theorem! Then, on p. 17,
he abruptly stated the connection with the theory of quadratic forms, a
subject upon which Hermite was an expert.

He let T be a matrix (“substitution”) with integer coefficients which
preserved an indefinite ternary quadratic form II and S be a
substitution sending $\xi^2 + \eta^2 - \zeta^2$ to II. Then $STS^{-1}$ preserves
$\xi^2 + \eta^2 - \zeta^2$ and sends $(\xi, \eta, \zeta)$ to $(\xi', \eta', \zeta')$, say.


The quantities 


os ¥—17 ! ie 3 v— 117 
i ——— a = ee ee 
¢ ¢ 
are related by transformation z’ = zK of the non-Euclidean plane 


provided q? + b? + c? = 0. He did not prove that a sheet of the 


hyperboloid of two sheets provides a model of non-Euclidean geometry 
—which is easy enough to establish—and remarked only that (p. 19): 
“All the points ZK are the vertices of a polygonal net obtained by 
decomposing the pseudogeometric plane into polygons 
pseudogeometrically equal to each other’. 

The third supplement, of only 12 pages, was received on 20
December. Its main result is the extension of the method of polygonal
decomposition to include cases where the angles are zero, and the roots
of the indicial equation differ by integers. The notable example is
Legendre’s equation. Poincaré’s method is to push the polygons
outwards until one or more vertices are “at infinity”, i.e. are on the
boundary of the disc, and the corresponding angles vanish. Since the
polygons hitherto studied had angles which were only rational
multiples of $\pi$, Poincaré’s argument relies heavily on its geometrical
plausibility.


The unpublished work makes abundantly evident the astounding
clarity of Poincaré’s mind, coupled to an almost equally dramatic
ignorance of contemporary mathematics. There is no mention of the
work of Schwarz on the hypergeometric equation, nor is there any
mention of the work of Dedekind or Klein, and even Hermite’s work on
modular functions, which he must have known, seems to have been
forgotten. In fact, these omissions are not mere oversights;
Poincaré genuinely did not then know the German work.

It marks two things: Poincaré’s arrival on the mathematical scene, 
and the recognition of non-Euclidean geometry as an important tool in 
mathematics and not merely as an unexpected feature of geometry. 
Fig. 16.3 A tessellation of the non-Euclidean disc by triangles. (Klein, Gesammelte
mathematische Abhandlungen, vol. III, p. 126)


16.4 Non-Euclidean Geometry 


For us to be able to say that all the triangles in Fig. 16.3 are congruent,
we have to define a sense of distance and a group of distance-preserving
transformations such that corresponding sides have the
same lengths and corresponding angles are equal. The convenient thing
about this kind of picture of non-Euclidean geometry is that it
represents angles correctly. This requires us to use angle-preserving
transformations of the disc to establish congruences, so we invoke
Möbius transformations. But what about lengths?

To investigate a formula for distance that is invariant under such
transformations, we begin by looking at maps of the above kind that
also map the real axis to itself. They are Möbius transformations of the
form
\[ \mu(z) = \frac{z + r}{rz + 1}, \quad r \in \mathbb{R}. \]


We want a formula for the distance between two points on the real line
with the property that the distance from 0 to $a$ is the same as the
distance from $\mu(0) = r$ to $\mu(a) = \frac{a + r}{ra + 1}$. If we write
$d$ for distance, then we want $d$ to be such that
\[ d(0, a) = d(\mu(0), \mu(a)) = d\left(r, \frac{r + a}{ra + 1}\right). \]
We notice that if $r = \tanh\rho$ and $a = \tanh\alpha$ then, by the
addition formula for tanh, the formula we want says that
\[ d(0, \tanh\alpha) = d(\tanh\rho, \tanh(\rho + \alpha)). \]


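This will be the case if we define $d$ by the formula
\[ d(z_1, z_2) = \tanh^{-1}\left|\frac{z_1 - z_2}{1 - \bar{z}_1 z_2}\right|, \]
for then $d(0, \tanh\alpha) = \alpha$ and
\[ d(\tanh\rho, \tanh(\rho + \alpha)) = \tanh^{-1}\tanh(\rho + \alpha - \rho) = \tanh^{-1}\tanh\alpha = \alpha, \]
as required.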
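
A quick numerical check can make this concrete. The sketch below is an
illustration added here, not part of the historical argument. It assumes
the distance formula in the form
d(z1, z2) = artanh |(z1 - z2)/(1 - conj(z1) z2)|, and the names `mobius`
and `dist` are hypothetical. It checks that d is unchanged by a map
z -> (z + r)/(rz + 1) and that distances add up along the real axis.

```python
import math

def mobius(z, r):
    """The map z -> (z + r)/(r z + 1); for real r with |r| < 1 it maps the disc to itself."""
    return (z + r) / (r * z + 1)

def dist(z1, z2):
    """Assumed distance formula: artanh |(z1 - z2) / (1 - conj(z1) z2)|."""
    z1, z2 = complex(z1), complex(z2)
    return math.atanh(abs((z1 - z2) / (1 - z1.conjugate() * z2)))

# Invariance on the real axis: compare d(0, a) with d(mu(0), mu(a)).
a, r = 0.3, 0.5
before = dist(0, a)
after = dist(mobius(0, r), mobius(a, r))
print(before, after)          # the two values agree up to rounding

# Additivity along the real axis, using the points tanh(rho) and tanh(rho + alpha).
rho, alpha = 0.4, 0.7
whole = dist(0, math.tanh(rho + alpha))
parts = dist(0, math.tanh(rho)) + dist(math.tanh(rho), math.tanh(rho + alpha))
print(whole, parts)           # both are close to rho + alpha

# The same invariance holds for points off the real axis.
p, q = 0.2 + 0.1j, -0.3 + 0.4j
off_axis_before = dist(p, q)
off_axis_after = dist(mobius(p, r), mobius(q, r))
print(off_axis_before, off_axis_after)
```

The values of a, r, rho, alpha, p and q are arbitrary choices; any points
inside the unit disc and any real r with |r| < 1 should behave the same way.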
If we represent Euclidean distances from the centre by $r$ and
non-Euclidean distances from the centre by $\rho$, then the above formula says
that
\[ \rho = \tanh^{-1}(r), \quad \text{or} \quad r = \tanh\rho, \]
so
\[ dr = \operatorname{sech}^2\rho \, d\rho = (1 - \tanh^2\rho)\, d\rho \]
and therefore
\[ d\rho = \frac{dr}{1 - r^2}. \]
This makes good sense. As a point moves outwards its distance from
the centre tends to 1 and so $1 - r^2$ tends to zero. Therefore, equal
non-Euclidean steps of $d\rho$ are represented by steadily smaller Euclidean
steps of $dr$.
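
The shrinking of the Euclidean steps can be seen numerically. The
following is a minimal sketch, assuming the relation r = tanh(rho)
between Euclidean radius r and non-Euclidean distance rho from the
centre; the step length `STEP` and the helper name `rho` are
illustrative choices, not anything in the sources.

```python
import math

STEP = 0.5   # length of each non-Euclidean step (an arbitrary illustrative value)

# Euclidean radius reached after k equal non-Euclidean steps from the centre.
radii = [math.tanh(k * STEP) for k in range(8)]
euclidean_steps = [b - a for a, b in zip(radii, radii[1:])]

for k, (r, dr) in enumerate(zip(radii[1:], euclidean_steps), start=1):
    print(f"step {k}: r = {r:.6f}, Euclidean step = {dr:.6f}")

def rho(r):
    """Non-Euclidean distance from the centre of a point at Euclidean radius r."""
    return math.atanh(r)

# Finite-difference check of d(rho) = dr / (1 - r^2) at one radius.
r0, h = 0.6, 1e-6
slope = (rho(r0 + h) - rho(r0 - h)) / (2 * h)
print(slope, 1 / (1 - r0 ** 2))   # the two numbers should nearly agree
```

Each printed Euclidean step is smaller than the one before, while the
radius creeps towards 1 without reaching it.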
It can now be proved that in non-Euclidean geometry:

• geodesics appear as arcs of circles perpendicular to the boundary
circle;

• circles appear as circles;

• the angle sum of a triangle is less than $\pi$;

• the area of a triangle with angles $\alpha$, $\beta$, $\gamma$ is proportional to
$\pi - (\alpha + \beta + \gamma)$;

• there are many parallels to a given line through a point not on
that line.
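
The statement about angle sums can be tested on a single triangle. The
sketch below is an added illustration under two standard assumptions:
that z -> (z - a)/(1 - conj(a) z) is a Möbius transformation of the disc
to itself sending a to the centre, and that geodesics through the centre
are diameters. Since these maps preserve angles, the angle at a vertex
can be read off after moving that vertex to the centre. The vertices and
the function names are hypothetical.

```python
import cmath
import math

def to_origin(a, z):
    """Möbius map of the unit disc to itself that sends the point a to 0."""
    return (z - a) / (1 - a.conjugate() * z)

def angle_at(a, b, c):
    """Non-Euclidean angle at vertex a of the triangle a, b, c in the disc.

    After moving a to the centre, the geodesics from a to b and from a to c
    become segments of diameters, so the angle is the difference of arguments.
    """
    return abs(cmath.phase(to_origin(a, c) / to_origin(a, b)))

A, B, C = 0j, 0.5 + 0j, 0.5j     # hypothetical vertices inside the unit disc
angles = [angle_at(A, B, C), angle_at(B, C, A), angle_at(C, A, B)]
angle_sum = sum(angles)
defect = math.pi - angle_sum      # the quantity to which the area is proportional
print(angles, angle_sum, defect)
```

For this triangle the angle at the centre is a right angle, the other
two angles are each smaller than the Euclidean value of a quarter of pi,
and the sum falls short of pi.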


Theorems like these gave the new geometry its name, and 
occasioned much debate in the nineteenth century about its logical 
coherence and its physical applicability. With the profusion of new 
geometries in the twentieth century it gradually lost its position in the 
panoply of geometry, and is now usually known by the name Klein gave 
it: hyperbolic geometry. 


16.4.1 Summary 

Whether the web of triangles is on the sphere, on the plane, or on the
non-Euclidean disc, the triangles in each case are mutually congruent.
There is a small number of distinct spherical triangles, a small number
of distinct Euclidean ones, and an infinite number of distinct
non-Euclidean ones that can be used. In each case, the web can be mapped
to itself by a group of isometries, and there are functions F on the
sphere, plane, or non-Euclidean disc with the property that if g is a
member of the appropriate group of isometries then $F(gz) = F(z)$.

This generalises the periodic behaviour of the trigonometric
functions. For example, $\sin(x + 2k\pi) = \sin(x)$. In this case, the sine
function is defined on the real line and the group of integers acts as
follows: the integer $k$ sends $x$ to $x + 2k\pi$.


This was not exactly how it was seen before Poincaré. Schwarz’s 
discovery of the connection between the hypergeometric equation and 
the regular solids was new. The case of elliptic functions was well 
known, but the double periodicity of these functions was not seen as 
related to an action of the group $\mathbb{Z} \oplus \mathbb{Z}$ on the complex plane.


Poincaré’s introduction of non-Euclidean geometry was wholly new 
and surprising. 


16.5 Exercises 

1. Show that a Möbius transformation that maps the unit disc to itself
and fixes the points z = −1, 0, and 1 is the identity map.

2. Call the arc of a circle that lies inside the unit disc and meets the
unit circle at right angles (and also any diameter of the unit circle)
a d-line. Show that a Möbius transformation that maps the unit
disc to itself maps d-lines to d-lines.

3. Show that any Möbius transformation that maps the unit disc to
itself cannot map a segment of a d-line to a proper subset of itself.


Questions
1. Explain why the above exercises show that it is possible to speak of
the length of a segment of a d-line. Why did Poincaré regard the
presence of a group of Möbius transformations that map the unit
disc to itself as almost synonymous with the existence of a
(non-Euclidean) geometry in the disc?

2. Find what you can about what is called the Kleinian view of
geometry.

Footnotes 


1 For more detail, see Gray [124], upon which this account is based. 


2 It did however lead to a series of papers initiating the subject of flows on surfaces, see, e.g. 
Poincaré [213, 214], and the great memoir on celestial mechanics [215]. 


3 See Gray [124, 125]. 


4 For information about coaxial circles, see Appendix F. 


5 This is the motto of Poincaré’s home town of Nancy; it means “No-one touches me with
impunity”. The supplements are to be found in the Poincaré dossier in the Académie des
Sciences, but for whatever reason Nörlund did not publish them when he published the essay in
Acta Mathematica, and they were not included in Poincaré’s Oeuvres. The supplements confirm
and greatly amplify what Poincaré said in the lecture 28 years later, and have since been
published as Poincaré [225].


6 It was ranked behind one by Halphen, and was not published until Nörlund edited it for Acta
Mathematica Vol. 39, 1923, 58-93, in Oeuvres, I, 578-613.


7 Quoted in Poincaré, Oeuvres, II, 73. 


8 For more detail, see Poincaré [225]. 
17.1 Introduction


As clarity grew about the existence of solutions for linear ordinary
differential equations, and as methods for solving them
advanced, it became clear to mathematicians that the corresponding
story for partial differential equations was woefully underdeveloped.
The first to prove any kind of general existence theorem for partial
differential equations was Cauchy, who in 1819 successfully treated the
general first-order partial differential equation. His work on initial
conditions established the framework that later became known as the
Cauchy problem. Ampère also did important work on the subject at the
same time in his [2]. Cauchy then returned to the topic in a paper of
1842 and gave an argument to show that a partial differential equation
of any order defined by one or more analytic equations has an analytic
solution in a neighbourhood of a suitably chosen initial curve (or, if the
equation has more than two independent variables, a hypersurface).
This is the origin of the idea of the Cauchy hypersurface condition for
hyperbolic partial differential equations, but Cauchy’s account left
much for later mathematicians to do.


Cauchy’s ideas were rediscovered and extended by Sonya
Kovalevskaya, who also documented unexpected issues with initial
conditions and showed how the method can fail, and in the 1870s
Darboux further improved the analysis.

Meanwhile, in the 1860s, Riemann had given a clear example of how 
the method of characteristics can show that solutions will cease to 
exist, and applied this in the study of the propagation of sound to show 
how shock waves can form. His paper, which is discussed in Chap. 22, 
also contains innovative ideas about the use of Green’s function 
methods in the study of hyperbolic partial differential equations. 


17.2 Cauchy’s Method in 1819 


Cauchy was a student at the École Polytechnique from 1805 to 1807,
where he was taught analysis by Lacroix, whose Traité Élémentaire de
Calcul Différentiel et Intégral was required reading.¹ The account it
gives of partial differential equations was by then fairly standard, and
owed much to Monge’s approach, as one might expect at the École
Polytechnique. The equation
\[ Pp + Qq = R, \]
for a function $z(x, y)$, where $P$, $Q$, and $R$ are functions of $x$, $y$, and $z$,
$p = \frac{\partial z}{\partial x}$ and $q = \frac{\partial z}{\partial y}$, was solved by eliminating $p$ from the equation and
the identity $dz = p\,dx + q\,dy$ to obtain
\[ P\,dz - R\,dx = q(P\,dy - Q\,dx). \]
Lacroix then distinguished the simpler case, where the differential
$P\,dz - R\,dx$ only involves $x$ and $z$ and the differential $P\,dy - Q\,dx$ only
involves $x$ and $y$, from the general case. The simple case is solved by the
method of integrating factors, but in the general case the method fails,
although ad hoc changes of variable can sometimes help, as
Lacroix proceeded to show. Monge’s geometric approach was relegated
to a footnote in Sect. 348, where it was described as “very ingenious”.
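
A small worked example may help to show how the simpler case runs. It is
chosen here for illustration and is not taken from Lacroix: we assume
$P = x$, $Q = y$, $R = z$, so that the equation is $xp + yq = z$, and
$\varphi$ denotes an arbitrary differentiable function.

```latex
% Assumed example (not from Lacroix): P = x, Q = y, R = z, i.e. x p + y q = z.
\begin{align*}
x\,dz - z\,dx &= q\,(x\,dy - y\,dx)
   && \text{(the relation } P\,dz - R\,dx = q(P\,dy - Q\,dx)\text{)} \\
d\!\left(\frac{z}{x}\right) &= q\, d\!\left(\frac{y}{x}\right)
   && \text{(both sides multiplied by the integrating factor } 1/x^{2}\text{)} \\
\frac{z}{x} &= \varphi\!\left(\frac{y}{x}\right)
   && \text{(so } z/x \text{ depends on } y/x \text{ alone)} \\
z &= x\,\varphi\!\left(\frac{y}{x}\right).
\end{align*}
% Check: p = \varphi - (y/x)\varphi', q = \varphi', hence x p + y q = x\varphi = z.
```

Here the left-hand differential involves only $x$ and $z$ and the
right-hand one only $x$ and $y$, which is exactly the situation in which
a single integrating factor deals with both sides.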


Monge’s geometric method was redescribed by Cauchy, who 
explained it at length for equations in two variables and then showed 
how to overcome the problems of extending it to any number of 
variables. You can read Cauchy’s paper [34] in Sect. 31.1; it makes an 
instructive comparison with that of Monge.² It is likely, given Cauchy’s
growing appetite for mathematics, that he read Monge’s account. 

Neither Monge nor Cauchy specified what conditions on the 
function defining the partial differential equation are necessary for 
their proofs to work, but it is likely that Monge assumed that everything 
is analytic in something like the sense that every function admits a 
power series expansion, and that Cauchy assumed that functions were 
no more differentiable than necessary. That would put his paper of 
1819 on a par with his paper a couple of years later on ordinary
differential equations and with his introduction of epsilon-delta
analysis at the École Polytechnique in 1821. That said, as was typical in
Cauchy’s work, he let conditions on the function f emerge in the course
of the proof. In fact, although his [34] is an existence proof, Cauchy
never used the term “exist” and never stipulated what hypotheses on f
he used, namely, that f be continuously differentiable.

One of the assessment questions on this part of the course is to give 
an account of Cauchy’s proof in his paper [34] that first-order partial 
differential equations have solutions; there is a translation of the paper 
in Chap. 31. It would therefore be inappropriate to give a detailed 
explanation of it here, but we can note that it opens with a clever 
change of variables argument that greatly simplifies the equation to be 
solved, then there is an investigation of a necessary condition, then a 
quick, analytical derivation of the equations that Monge had exhibited a 
decade before, and then an investigation of the initial conditions. 


17.3 Cauchy and the General Partial Differential Equation


We must also be sketchy in our account of Cauchy’s papers of 1842 on
the existence of solutions to the general partial differential equation for
two reasons: the work is much more difficult, and what Cauchy
provided is itself little more than an outline at crucial points.


He had already stated his aims in the paper (Cauchy [36]); I take
this translation from Cooke ([47], 25):


In the theory of equations mathematicians have properly 
considered fundamental the question whether every equation 
has a root. Similarly in the integral calculus one of the most 
important questions, a fundamental question, is obviously 
whether every ordinary or partial differential equation can be 
integrated. But - and this ought to surprise us at first sight — 
despite the numerous works of mathematicians on the integral 
calculus, this question, important though it be, is nowhere 
solved in full generality. To be sure the existence of general 
integrals of ordinary differential equations, which contain only 
one independent variable, is now established by two different 
methods which I have given, the first in my lectures at the Ecole 
Polytechnique, the second in a lithographed memoir of 1835. .... 
In addition, the existence of general integrals of partial 
differential equations is established in certain cases where one 
is able to integrate these equations, for example when the 
equation reduces either to a single equation of first order or to 
linear equations in which the coefficients of the unknowns and 
of their derivatives remain constant. But does an arbitrary 
system of ordinary or partial differential equations always admit 
a corresponding system of general integrals? Such is the 
problem which seemed to me worthy of the attention of 
mathematicians. The present solution is based on 
considerations which I shall explain briefly. 

For a long time mathematicians, supposing without proof 
that every ordinary or partial differential equation admits a 
general integral, have considered Taylor’s formula as the means 
of developing this integral in a series of increasing integer 
powers of an increment i given to an independent variable t, 
which can be considered as representing time. Further, using a 
theorem which I proved in 1831 relating to the development of 
functions, one can be sure that in the case where the series so 
obtained is convergent, the sum of the series satisfies, as an 
integral, the ordinary or partial differential equation, at least for 


real or complex values of the increment i whose moduli do not 
exceed a fixed bound. Moreover the same remark applies to the 
sums of the series obtained when, assuming the existence of 
general integrals of a system of ordinary or partial differential 
equations, one sets about developing them in Taylor series. But 
in all cases it remains to be proved that the series so obtained is 
convergent, at least for i of sufficiently small modulus. Now this 
end can be achieved using a fundamental theorem which not 
only determines a bound beneath which the modulus of i may 
vary arbitrarily without causing the series obtained to diverge, 
but also determines a bound on the error caused by terminating 
each series after a certain number of terms. The proof of this 
theorem is based, as will subsequently be seen, on the principles 
of the new calculus which I have called “calcul des limites” and 
on a device of analysis which can be given many useful 
applications. 


Cauchy then went on to develop his “calcul des limites”, which we call 
the method of majorants, for determining if a series converges by 
comparison with another series that has larger terms but does 
converge. 
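
The comparison idea can be illustrated in its simplest form. The sketch
below is an added example, not Cauchy's own construction: it assumes
coefficients a_n bounded by M / r**n, so that the series is dominated
term by term by a geometric series, and it checks that the error made by
stopping after N terms stays below the tail of that geometric series.
The coefficients, the constants `M` and `r`, and the function names are
all hypothetical.

```python
import math

M, r = 1.0, 2.0          # assumed bounds: |a_n| <= M / r**n for every n

def a(n):
    """Hypothetical coefficients that satisfy the assumed bound."""
    return math.cos(n) / r ** n

def partial_sum(x, N):
    """Sum of the first N terms a_0 + a_1 x + ... + a_{N-1} x**(N-1)."""
    return sum(a(n) * x ** n for n in range(N))

def tail_bound(x, N):
    """Tail of the dominating geometric series: M q**N / (1 - q), q = |x|/r < 1."""
    q = abs(x) / r
    return M * q ** N / (1 - q)

x = 1.5                               # any x with |x| < r will do
reference = partial_sum(x, 400)       # a long partial sum used as a stand-in for the limit
rows = []
for N in (5, 10, 20, 40):
    error = abs(reference - partial_sum(x, N))
    rows.append((N, error, tail_bound(x, N)))
    print(N, error, tail_bound(x, N))
```

In every row the observed truncation error is no larger than the bound
supplied by the dominating series, which is the kind of control over
both convergence and error that the passage describes.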

An application of these ideas to the theory of partial differential 
equations came in his [37]. He began by claiming that any partial 
differential equation can be reduced to a system of first-order (and 
what we would call) quasi-linear partial differential equations by 
introducing more unknown functions, and then said that therefore it 
was enough to show how to solve a system of such first-order partial 
differential equations. 

In his [37] he gave a careful argument to show that a single such
equation can be solved if the equation has analytic coefficients and
certain analytic initial data is given, and in his [37] he dealt with a
system of such equations.

In his [37], Cauchy set himself the task of showing that a partial
differential equation of the form
\[ u_t = A_1 u_{x_1} + A_2 u_{x_2} + \cdots + A_n u_{x_n} + V \tag{17.1} \]
for an unknown function $u$ has a solution, where $A_1, \ldots, A_n$ and $V$
are analytic functions of the independent variables $x_1, \ldots, x_n$, and $t$,
and the value of $u$ is prescribed in a neighbourhood of a point in which
$t = \tau$, a constant. Thus, the initial data is the value of
$u(x_1, x_2, \ldots, x_n, \tau)$ and its partial derivatives with respect to the other
variables $x_1, x_2, \ldots, x_n$, namely, $u_{x_j}(x_1, x_2, \ldots, x_n, \tau)$.


He now investigated the consequences of assuming the partial
differential equation has a solution that is a power series in powers of
$t - \tau$ when $t$ is close to some value $\tau$. This means that near $t = \tau$ the
solution $u$ is a function $w$ of the form
\[ w = I_0 + I_1 (t - \tau) + I_2 (t - \tau)^2 + \cdots. \tag{17.2} \]
The coefficient $I_n$ is given by
\[ I_n = \frac{D_t^n w}{n!}, \]
where $D_t^n w$ is the nth derivative of $w$ with respect to $t$ evaluated at the
point $t = \tau$. His task was now to show that this series converges for
suitably small absolute values of $t - \tau$.

Cauchy interpreted the partial differential equation as saying that
\[ D_t w = A_1 D_{x_1} w + A_2 D_{x_2} w + \cdots + A_n D_{x_n} w + V, \]
and so the coefficients $I_n$ are expressions in various $D_{x_j}$ acting on
various $A_j$. His question now was how to estimate them and obtain the
convergence result that he wanted.

He let the variables vary by small amounts and considered the
maximum effect their variation has on the coefficients $A_1, \ldots, A_n$
and $V$. He then observed that this effect is produced by a particularly
simple partial differential equation and so a study of this partial
differential equation could be used to study Eq. (17.1). This is the
equation of the same form as Eq. (17.1) in which


= a a 
Af X1, X2,.--5 Xn t, W) = A(XjX2...X,tw), J=l,...,n 
in which @,Q,...Q@, are constants. 


This is an equation of the kind that his paper of 1819 applies to, and 
it can be solved by passing to a system of ordinary differential 
equations. These equations have a solution for some non-zero values of 
the variables. This solution dominates the conjectured power series 
solution (17.2), and so it converges and, as Cauchy checked, it defines a
solution of the original partial differential equation (17.1) in a
neighbourhood of the given system of initial values.
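
To see how the coefficients of such a power series are forced by the
equation, it may help to work through the simplest possible instance.
The sketch below is an added illustration and is far from Cauchy's
general argument: it assumes the single equation u_t = u_x (one space
variable, coefficient 1, no extra term) with the analytic initial data
u(x, 0) = 1/(1 - x), both chosen here for convenience. The coefficients
are computed recursively from the equation and the truncated series is
compared with the closed form 1/(1 - x - t).

```python
from fractions import Fraction

ORDER = 30   # keep the terms x**m * t**n with m + n <= ORDER

# c[(m, n)] is the coefficient of x**m * t**n in the formal solution.
# Assumed initial data: u(x, 0) = 1/(1 - x) = 1 + x + x**2 + ...
c = {(m, 0): Fraction(1) for m in range(ORDER + 1)}

for n in range(ORDER):
    for m in range(ORDER - n):
        # Comparing coefficients of x**m * t**n in u_t = u_x gives
        # (n + 1) * c[m, n + 1] = (m + 1) * c[m + 1, n].
        c[(m, n + 1)] = Fraction(m + 1, n + 1) * c[(m + 1, n)]

def truncated_solution(x, t):
    """Sum of the computed terms of the double power series at (x, t)."""
    return sum(float(coef) * x ** m * t ** n for (m, n), coef in c.items())

x, t = 0.1, 0.2
approx = truncated_solution(x, t)
exact = 1 / (1 - x - t)       # the function the series sums to when |x| + |t| < 1
print(approx, exact)
```

In this toy case the series visibly converges near the initial point.
The substance of Cauchy's work lies in proving convergence when no
closed form is available.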

As Cooke ([47], 27) remarked, Cauchy did not discuss the 
uniqueness of the solution, when it exists. He seems to have assumed 
that the solution is determined by the initial values. But these papers 
form what is regarded as Cauchy’s contribution to the Cauchy- 
Kovalevskaya theorem, so historical generosity seems to have been at 
work. 


17.4 Kovalevskaya’s Theorem and Her Counter-Example


In 1875, the Russian mathematician Sonya Kovalevskaya (Fig. 17.1)
published a paper in the Berlin Journal für die reine und angewandte
Mathematik [165] that conveyed the results she had written the year
before as a private student of Weierstrass’s in Berlin, and would
undoubtedly have led to the award of a Ph.D. at the University of Berlin
had women been eligible to study for degrees there at all. But they were
not, and so Weierstrass persuaded one of his former students, Lazarus
Fuchs, by then a professor at Göttingen, to see that she was awarded a
Ph.D. there.³
Fig. 17.1 Sonya Kovalevskaya (1850-1891). Acta Mathematica, Table générale des tomes 1-3,
1882-1912, p. 153


In this paper, Kovalevskaya gave a new proof of Cauchy’s theorem 
on the existence of solutions to a first-order quasi-linear partial 
differential equation in the analytic case. She says that she had learned 
this result from Weierstrass’s lectures, and it seems that neither of 
them knew of Cauchy’s much earlier proof of the same result. Her 
ignorance is understandable, although she does mention work by Briot 
and Bouquet [23], who do cite Cauchy, but it is striking that 
Weierstrass could claim that he did not. 

She also indicated how the theorem might be extended to systems
of partial differential equations, and therefore to partial differential
equations defined by a polynomial equation in n variables and the
partial derivatives of the unknown function. She showed that there is
always a convergent power series in the variables about a point
$(a_1, a_2, \ldots, a_n)$ that satisfies the equation at that point, and that if such
a function satisfies the partial differential equation then the
coefficients of its Taylor series expansion can be determined from the
partial differential equation. This theorem, which was independent of
Cauchy’s, is her contribution to the Cauchy-Kovalevskaya theorem.
She then went on to surprise her supervisor with a novel, and
indeed disturbing, observation about initial conditions for partial
differential equations. Her example ([165], 22) was all the more
disturbing because it concerned the one-dimensional heat equation,
which one might have thought was well understood. The equation is
\[ u_t = u_{xx}, \]
and everything could be expected to be well behaved if $u(x_0, t)$ and
$u_x(x_0, t)$ are given as analytic functions of $t$. Kovalevskaya took as initial
conditions the requirement that $u(x, 0) = (x - 1)^{-1}$, and observed that
the partial differential equation $u_t = u_{xx}$ is formally solved by the
infinite series
\[ \sum_{n=0}^{\infty} \frac{f^{(2n)}(x)}{n!}\, t^n, \]
which reduces to the function $f(x) = (x - 1)^{-1}$ when $t = 0$. However,
the power series solution diverges for all $t \neq 0$.


It might be objected that the boundary condition involves a function
that becomes infinite when $x = 1$. Could this be the reason that the
solutions diverge? It is not, for, as Cooke ([47], 33) observes, one can
conduct a similar analysis when $f(x) = (1 + x^2)^{-1}$. In this case, the
formal solution to the partial differential equation is
\[ u(x, t) = \sum_{m,n=0}^{\infty} (-1)^{m+n} \frac{(2m + 2n)!}{(2m)!\, n!}\, x^{2m} t^n. \]
When $t = 0$ this reduces to the series
\[ u(x, 0) = \sum_{m=0}^{\infty} (-1)^m x^{2m}, \]
which is indeed the power series for $(1 + x^2)^{-1}$, a function that is never
infinite (provided $x$ is real, which the partial differential equation
surely requires). However, when $x = 0$ the series reduces to
\[ u(0, t) = \sum_{n=0}^{\infty} (-1)^n \frac{(2n)!}{n!}\, t^n, \]
which diverges for all $t \neq 0$.
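
The divergence is easy to watch numerically. The sketch below is an
added illustration; it assumes that the series at x = 0 has terms
(-1)**n (2n)!/n! t**n, and the function names `term` and `ratio` are
hypothetical. The ratio of consecutive terms has size 2(2n + 1)|t|,
which grows without bound for every non-zero t, so the terms eventually
increase however small t is.

```python
from math import factorial, isclose

def term(n, t):
    """n-th term (-1)**n * (2n)!/n! * t**n of the assumed series at x = 0."""
    coefficient = factorial(2 * n) // factorial(n)   # an exact integer
    return (-1) ** n * coefficient * t ** n

def ratio(n, t):
    """Size of term n + 1 relative to term n; it equals 2 * (2n + 1) * |t|."""
    return abs(term(n + 1, t) / term(n, t))

t = 0.01   # an arbitrary small non-zero value
for n in (0, 10, 24, 25, 40, 60):
    print(n, abs(term(n, t)), ratio(n, t))
```

With t = 0.01 the terms shrink at first, because the ratio is below 1
for small n, but once the ratio passes 1 they grow again and keep
growing, so the terms do not tend to zero.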


Cooke goes on to quote a solution in the form of a Fourier integral:
\[ u(x, t) = \int_0^{\infty} e^{-y} e^{-y^2 t} \cos(xy)\, dy, \]
which is analytic only if $t > 0$.


That year, 1875, the French were also caught ill-informed. The 
young Gaston Darboux published a four-page paper in the Comptes
Rendus for January 1875 in which he rederived Cauchy’s result from
1835 on the solution of ordinary differential equations and extended it
to partial differential equations, noting that Briot and Bouquet had 
given a new proof and explored the consequences of the theorem. He 
also noted that one still lacked a perfectly general theory of equations 
of this kind, and promised a subsequent paper in which he would 
explain the theory of characteristics, which he attributed to Monge. 

Darboux’s argument was not that different from Cauchy’s: first 
show that there is a formal power series that satisfies the equation and 
the given initial conditions; second show that for a certain range of the 
variables this series converges. 

Within the month the Italian mathematician Angelo Genocchi had 
written in with “some observations”. He admired Darboux’s talent, but 
he noted that Cauchy had written about the problem for systems of 
partial differential equations in a series of papers in the Comptes 


Rendus for 1842, so Cauchy deserved the credit for the first proof. Then 
in 1873 the French mathematician Puiseux had made some important 
remarks about implicit functions that an Italian mathematician called 
Félix Chio had amplified, also in the Comptes Rendus. Genocchi added 
that Cauchy also deserved credit for the theory of higher dimensional 
spaces “about which there is so much noise at present”, and that there 
was also the delicate point discussed by German mathematicians of 
what they called “convergence in equal degree” (and we call uniform 
convergence today). 

Genocchi’s note was published, and the perpetual secretary of the 
Académie des Sciences, Joseph Bertrand, took the opportunity to press 
for the prompt publication of the Oeuvres of Cauchy. The opportunity 
was also given to Darboux to further develop his method. 

All this burst of activity came as a surprise to Weierstrass, as Cooke
relates. It seems that Weierstrass had not renewed his subscription to
the Comptes Rendus on time, and so only got the relevant copies of it
some time after they had appeared in early 1875. He quickly wrote to
Kovalevskaya to tell her what was going on (Cooke [47], 35):


So you see, my dear, that this question is one which is awaiting 
an answer, and I am very glad that my student was able to 
anticipate her rivals in time and at least not fall behind them in 
working out the problem. 

Darboux mentions several exceptional cases which are of
special interest; I am inclined to think that he also has
encountered the difficulties (as in the equation
$\frac{\partial \varphi}{\partial t} = \frac{\partial^2 \varphi}{\partial x^2}$) which
gave you so much trouble at first and which you later overcame
so successfully.


He also sent a copy of her dissertation to Hermite, Darboux’s mentor, 
and it so impressed both men that they became staunch advocates for 
her later on. This surely contributed to the high opinion that Poincaré 
was to have of her work. In the course of his prize-winning paper on 
celestial mechanics that made his name internationally, he wrote 
(Poincaré [215], Sect. 3): “Mme Kovalevski has considerably simplified 
Cauchy’s demonstration and has given the theorem its definitive form”. 


It might seem that this means only that Kovalevskaya improved on 
Cauchy’s method of proof, and indicated that it can break down. In fact, 
twentieth-century mathematicians found that her theorem pointed to a 
number of subtle developments of which one is worth mentioning here 
precisely because it was not sufficiently appreciated at the time:
boundary conditions matter greatly when solving partial differential 
equations. We shall return to this point in later chapters. 


17.5 Exercises 


The central points of this lecture are to establish that Cauchy opened up 
the study of general partial differential equations (of order greater than 
one) but only in the analytic case, and that by a more careful analysis 
Kovalevskaya was able to show that his method of finding a transversal 
hypersurface did not always work. 

It seems to me that these points would be obscured by working 
through mathematical examples, so none are provided, except for this 
one: 


1. Find examples of power series with a zero radius of convergence.


Questions


1. What does the reception of Cauchy’s ideas about partial differential
equations tell us about how mathematical ideas circulated in the
mid-nineteenth century?


Footnotes 


1 The École Polytechnique had just been reorganised by Napoleon as a military school; Cauchy
entered third in the ranking of the 125 entrants. 


2 The paper is reprinted in his Oeuvres series 2, volume 2, 1958. 


3 German universities were admirably relaxed about where people had studied for their 
degree. 
18.1 Introduction


Green introduced the functions that have come to bear his name in an 
attempt to solve problems in potential theory. Here we shall see how he 
used them, and how Dirichlet and Riemann used them to study 
Laplace’s equation. In this chapter, more than is often the case when 
dealing with mathematics as it was discovered, the results are 
imprecise and quite some effort from later mathematicians was needed 
to make them rigorous. 


18.2 Green’s Theorems and Green’s Functions 


Green introduced these functions in his famous Essay [128], where he
claimed that given the value of a function $\bar{V}$ on a closed surface $C$
there is a unique continuous extension to a function $V$ defined on the
interior which satisfies the Laplace equation and has no singular values
inside the surface. This is, of course, the Dirichlet problem (before
Dirichlet).


He defended this claim as follows (I have somewhat modified his 
language). He supposed that there is a function U that is harmonic
except at the point P, where it becomes infinite like 1/r near the origin
and is zero on the surface C. Then it followed from a general theorem


of his that 


where the expression $\frac{\partial U}{\partial n}$ denotes the normal derivative of the function


U. As he remarked, this shows that the value of V at P is known when its 
values are known on the surface. 

Note, however, that Green’s function U depends on a parameter that 
locates the singular point. If you can determine a Green’s function for a 
given region that has its singular point at an arbitrary point P then this 
formula does indeed define a harmonic function in the region—but it is 
Green’s function that varies, not the values of a single Green’s function. 

But why does such a function exist? Green answered this question 
this way ([116], 32): 


To convince ourselves that there does exist such a function as we 
have supposed U to be; conceive the surface to be a perfect 
conductor put in communication with the earth, and a unit of 
positive electricity to be concentrated in the point P, then the 
total potential function arising from P and from the electricity it 
will induce upon the surface, will be the required value of U. For, 
in consequence of the communication established between the 
conducting surface and the earth, the total potential function at 
this surface must be constant, and equal to that of the earth 
itself i. e. to zero (seeing that in this state they form but one 
conducting body). Taking, therefore, this total potential function 
for U, we have evidently $0 = U$, $0 = \nabla^2 U$, and $U = 1/r$ for


those parts infinitely near to P. As moreover, this function has no 
other singular points within the surface, it evidently possesses 
all the properties assigned to U in the preceding proof. 


This argument is an appeal to physics pure and simple. Nonetheless, 
the introduction of a function of a particular kind that solves what
came to be called the Dirichlet problem was to become a dominant idea
in work on potential theory. Such functions are today called Green’s 
functions. Their use derives from what has become known as Green’s 
identity. 

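Let us introduce the notation $\nabla U$ of a function $U(x, y, z)$ to
mean
\[ \nabla U = (U_x, U_y, U_z), \]
and
\[ \nabla^2 U = U_{xx} + U_{yy} + U_{zz}, \]
the Laplacian of $U$.¹
Green began by considering $\nabla U \cdot \nabla V$. It is a sum of three terms, and
integrating each by parts gave him
\[ \int_{\mathrm{vol}} \nabla U \cdot \nabla V = \int_{\mathrm{surf}} V (\mathbf{n} \cdot \nabla U) - \int_{\mathrm{vol}} V \nabla^2 U, \]
where integrals $\int_{\mathrm{vol}}$ are taken over a region $D$ in $\mathbb{R}^3$ and integrals $\int_{\mathrm{surf}}$
are taken over the surface $C$ of the region $D$, and $\mathbf{n}$ is the outward unit
normal vector at a point on $C$.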


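Likewise,
\[ \int_{\mathrm{vol}} \nabla V \cdot \nabla U = \int_{\mathrm{surf}} U (\mathbf{n} \cdot \nabla V) - \int_{\mathrm{vol}} U \nabla^2 V, \]
but switching $U$ and $V$ does not change the integral on the left-hand
side, so
\[ \int_{\mathrm{surf}} V (\mathbf{n} \cdot \nabla U) - \int_{\mathrm{vol}} V \nabla^2 U = \int_{\mathrm{surf}} U (\mathbf{n} \cdot \nabla V) - \int_{\mathrm{vol}} U \nabla^2 V, \]
and this rearranges to give Green’s identity:
\[ \int_{\mathrm{vol}} U \nabla^2 V - \int_{\mathrm{vol}} V \nabla^2 U = \int_{\mathrm{surf}} U (\mathbf{n} \cdot \nabla V) - \int_{\mathrm{surf}} V (\mathbf{n} \cdot \nabla U). \]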


The Poisson problem asks for a function $V$ with these properties

• $\nabla^2 V = F$ in $D$ and
• $V = f$ on $C$

for given functions $F$ and $f$. It reduces to the Dirichlet problem when
$F = 0$.


Green’s method transforms the Poisson problem into another that
might be easier to solve. He looked for a function $U$ such that

• $\nabla^2 U = 0$ except at one point $P$ in $D$, where it is infinite like $1/r$
(and $r$ is the radial distance from $(x, y, z)$ to $P$) and
• $U = 0$ on $C$.


We plug these into Green’s identity and get 


i UV’V - { VV-U = { U(n.VV) - Vin.VU). 
vol vol surf surf 


We take these integrals in turn. 


. {,VV°U = WP) 
. ie uv°V = ie UF 
ung Va.VU) = fp fM.VU) 
fume YOAV) = 0. 

So we deduce that 


V(P) = { UF+ |] fMmvVU). 
vol sur f 


This expresses the function V in terms of the given data F and f and the
function U, which is known as a Green’s function. The hope is that U can
be found because it depends only on the shape of D. For simple shapes,
Green’s function can often be found explicitly, thus yielding a specific
solution of the Dirichlet problem. Likewise, theorems that establish the
existence of a Green’s function for a large class of domains (such as
Harnack’s theorem, see Sect. 19.3) also solve the Dirichlet problem for
those domains, and that can be easier to do.
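As an illustration of a simple shape for which a Green’s function can be written down, the following sketch uses the classical method of images for the unit ball. It is a hedged example, not the book’s own: the function names, the sample pole and the test points are assumptions chosen here, and the normalisation (infinite like 1/r at the pole) follows the text. The code checks that the function vanishes on the unit sphere and that a finite-difference Laplacian is close to zero away from the pole.

```python
import math


def dist(p, q):
    """Euclidean distance between two points of three-space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def green_unit_ball(x, pole):
    """Method-of-images function 1/|x - P| - 1/(|P| |x - P*|), with P* = P/|P|^2."""
    rho = math.sqrt(sum(c * c for c in pole))
    if rho == 0.0:
        return 1.0 / dist(x, pole) - 1.0  # pole at the centre of the ball
    image = tuple(c / rho ** 2 for c in pole)
    return 1.0 / dist(x, pole) - 1.0 / (rho * dist(x, image))


def sphere_point(theta, phi):
    """A point of the unit sphere in spherical coordinates."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def discrete_laplacian(func, x, h=1e-3):
    """Second-order central-difference approximation to the Laplacian at x."""
    total = -6.0 * func(x)
    for axis in range(3):
        for sign in (-1.0, 1.0):
            shifted = list(x)
            shifted[axis] += sign * h
            total += func(tuple(shifted))
    return total / (h * h)


pole = (0.3, -0.2, 0.5)  # hypothetical pole P inside the ball
boundary_values = [green_unit_ball(sphere_point(t, p), pole)
                   for t in (0.4, 1.3, 2.5) for p in (0.0, 2.0, 4.5)]
max_boundary_value = max(abs(v) for v in boundary_values)
interior_laplacian = discrete_laplacian(lambda x: green_unit_ball(x, pole),
                                        (-0.4, 0.1, -0.2))

print(max_boundary_value, interior_laplacian)
```

Both printed numbers should be close to zero, the first up to rounding and the second up to the finite-difference error.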


18.3 Dirichlet Principle and Problem 


The young William Thomson seems to have been the first to state 
something like this principle. In a paper in Liouville’s Journal for 1847, 
he claimed that a harmonic function on a given region bounded by a 
surface could be found for which the normal derivative at every point 
on the surface took a given value F. 
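He claimed that among the functions V for which

$$\int_{\mathrm{surf}} V F \, dS = A,$$

where A is an arbitrary constant, there will be one for which the
integral

$$\iiint \left[ \left(\frac{\partial V}{\partial x}\right)^2 + \left(\frac{\partial V}{\partial y}\right)^2 + \left(\frac{\partial V}{\partial z}\right)^2 \right] dx\,dy\,dz$$

takes a minimum value, and for this function V

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0,$$

and the normal derivative of V is a constant multiple of the given value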
F. The only proof he gave was that the result followed by the calculus of 
variations, which in fact only connects the double and triple integrals 
but does not guarantee the existence of a minimum. 


The Dirichlet problem specifies a simply connected domain T with a
boundary $\partial T$, and a continuous function defined on $\partial T$. It then asks
for a harmonic function defined on the interior of T that continuously
extends the function defined on $\partial T$. It is then easy to show that the
function is unique, if it exists. Dirichlet had suggested in lectures in
1856-1857 at Göttingen that a solution would always exist because of a
general argument that became known as the “Dirichlet principle”.


The Dirichlet principle, as given by Grube, states’: 


For every bounded connected domain T there are clearly 
infinitely many functions u continuous together with their first- 
order derivatives, for x, y, z which reduce to a given value on this 
surface. Among these functions there will be at least one which 
reduces the following integral

$$U = \int \left[ \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial u}{\partial z}\right)^2 \right] dT,$$

extended over the domain T, to a minimum; it is evident that this
integral has a minimum since it cannot become negative. We can
now show the following:

1. Every such function u which minimizes U, satisfies the
differential equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$$

everywhere in the
domain T. This already makes it clear that there always exists a
function u having the desired property, namely that function for
which U becomes a minimum.

2. Every function u which satisfies the [above, JJG] 
differential equation within the domain T, minimizes the 
integral U. 

3. The integral U can have only one minimum. It follows from 
2 and 3 that there is only one function u with the desired 


property. 


As is often remarked, the problem with these claims is that they assume 
that there is a function that minimises the integral simply on the 
grounds that the integral can never be negative. But that is to confuse 
the existence of a lower bound with the existence of a function for 
which the integral attains its lower bound. (Compare the behaviour of 


the function $f(x) = 1/x$ on the positive real axis: it is bounded below
by zero, but there is no value of x for which $f(x) = 0$.)


So Thomson had shown that the Dirichlet principle is a claim in the 
calculus of variations, and indeed the Euler-Lagrange equation for the 
integral U leads to the Laplace equation. But that does not vindicate the 
principle—it merely locates it in a family of plausible but unproved 
claims. We shall now look briefly at the first attempts to prove it, and 
then at other attempts on the Dirichlet problem. 

We also have Dedekind’s statement of the Dirichlet principle, which 
is quoted in a paper by Weierstrass [271].⁴ There Dedekind wrote that


Given any finite surface, one can always, and in only one way, 
endow it with mass so that the potential at any point of the 
surface has an arbitrary, continuously varying, value. 


This is not very clear, but Dedekind followed with a mathematical 
interpretation: 


As a proof, we offer the following theorem. 

Given any finite connected space t, there is always one and
only one function w that, together with its first derivatives, is
everywhere continuous in t, and on the boundary of t takes
arbitrarily prescribed, continuously varying values, and satisfies
the equation

$$\frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2} + \frac{\partial^2 w}{\partial z^2} = 0$$

everywhere in t.


Dedekind then noted that an exactly analogous situation holds in the 
subject of heat diffusion, where it is also intuitively evident. Then he 
went on 


We prove the theorem by drawing on pure mathematical 
evidence. It is in fact reasonable that among all functions u that, 
together with their first derivatives, are everywhere continuous 


in t, and take arbitrarily prescribed values on the boundary of t,
there must be one (or more) that give the integral taken over the
entire space t its least value.


Dedekind first proved that such a minimiser has the required property, 
and then that it is unique. 


18.4 Riemann on Green’s Theorem 


Riemann lectured on this material in the summer semester of 1861 at
Göttingen. His lectures were, one could say, his own reworking of what
Dirichlet had done, and they were published posthumously in an
edition by his former student Karl Hattendorff in 1875, under the title
Schwere, Elektricität und Magnetismus (Gravity, Electricity and
Magnetism).

In Sect. 21, Riemann showed how to construct Green’s functions to
solve the Dirichlet problem. First, he derived Green’s theorem following
Green’s own argument, to which he referred. Then he assumed that
there is a function $U_1$ that satisfies Laplace’s equation inside a
domain T bounded by a closed surface S and that takes the value $-1/r$
on S, where r is the distance from a point (x, y, z) to a fixed but arbitrary
point $P' = (x', y', z')$ in T. He deferred proof of the existence of the
function $U_1$ until Sect. 34. Then the function

$$U = U_1 + \frac{1}{r}$$

satisfies the Laplace equation in T except at the point $P'$, where it
becomes infinite like 1/r, and it takes the value zero on S.

He now surrounded the point $P'$ with a small sphere of radius c
that lies entirely in T, and considered separately the interior of this
sphere and the complement of this region in T.

In the complement the functions U and V satisfy the conditions of Green’s
theorem, and Riemann considered the limit as $c \to 0$. The integral

$$\int_T V \nabla^2(U)\,dx\,dy\,dz$$

is zero over all of T and can be ignored. It remained to consider

$$-\int_T U \nabla^2(V)\,dx\,dy\,dz.$$

The space T being closed and bounded the integral taken over the
complement is bounded for any value of c. Inside the small sphere the
volume element may be taken to be $r^2 \sin\theta\,dr\,d\theta\,d\varphi$, and because U
only becomes infinite like 1/r the contribution of the integral over the
small sphere to the whole integral remains finite as $c \to 0$.

The integral over S, the outer boundary of the complement, is

$$\int_S V \frac{\partial U}{\partial n}\,d\sigma,$$

where $\partial U/\partial n$ is the normal derivative of U. The integral over the inner
boundary, the small sphere of radius c, reduces to $-4\pi V(P')$ as $c \to 0$.

So Riemann deduced that

$$4\pi V(P') = -\int_T U \nabla^2(V)\,dx\,dy\,dz + \int_S V \frac{\partial U}{\partial n}\,d\sigma.$$


Riemann next looked at what happens when the function V becomes 
discontinuous in various ways, corresponding in potential theory to the 
presence of mass on a surface, on a line, or concentrated at a point. 

He also showed that the potential function with given boundary
values is unique, and showed how to find it explicitly for regions of
various shapes. He showed that if $Q' = (u', v', w')$ is another point in T
and we take the Green’s functions that become infinite at the points
$P'$ and $Q'$, respectively, then the value of the first at $Q'$ equals the
value of the second at $P'$, and remarked that this meant that U was a
symmetric function of $(x', y', z')$ and $(u', v', w')$.


18.5 Riemann on the Dirichlet Principle 


Riemann discussed the existence of a potential function in Sect. 34. He 
noted that Green had established it by an appeal to physics, but this left 
a gap that Gauss had filled. Gauss had argued in his ([117], Sects. 31- 
34) that given any closed surface S in three-space and a continuous 
function U on the surface, there is a mass distribution M on the surface 
(which may even be negative) such that the potential function V of this 
mass distribution differs from the given function U by a constant, and 
indeed that the mass distribution can be chosen so that $V = U$. In
other words, there is a distribution of mass such that the function
$V - U$ vanishes everywhere on S.


But, said Riemann, this proof was too close to potential theory, and 
a purely analytic proof was needed. That, he said, had been provided by 
Dirichlet in his lectures, and he proceeded to describe it. The claim is 
that any single-valued, finite, continuous function v on a closed surface 
S in three-space can be extended in a unique way into the interior T so 
as to remain single-valued, finite, and continuous and satisfy the 
Laplace equation 


$$\nabla^2 v = 0.$$


To prove this theorem, he considered the integral over the inside of S

$$\Omega(u) = \int_T \nabla u \cdot \nabla u \, dx\,dy\,dz,$$

where u agrees with v on S and is continuously differentiable inside S.
Evidently there are infinitely many such functions, and
Riemann denoted one by $u$, and any other by $u + hs$, where h is
an arbitrary constant and s is a function of x, y, z that vanishes on S and
has the same properties as u inside S.
Then, said Riemann, the integral for $\Omega(u)$ depends on the function u
but is always positive and finite. Therefore, he went on, there is a
particular function v for which the integral $\Omega(v)$ takes its least value.
This value cannot be zero, for then v would be a constant and the
values on S must all be the same.
Riemann now considered the function $v + hs$ and deduced that

$$\Omega(v + hs) = \Omega(v) + 2hI + h^2 \Omega(s),$$

where

$$I = \int_T \nabla(v) \cdot \nabla(s)\,dx\,dy\,dz.$$

This forces $I = 0$, for otherwise when h is very small and $hI$ is negative
one could have

$$\Omega(v + hs) < \Omega(v),$$

contradicting the minimality of $\Omega(v)$.
From this Riemann deduced that $I = 0$ is the necessary, and also
sufficient, condition for $\Omega(v)$ to take its minimum value. Integration by
parts then shows that this condition is the same as

$$\int_T s \nabla^2(v) = 0,$$

and because s is arbitrary therefore that

$$\nabla^2(v) = 0.$$


Riemann now deduced that the function v is continuously 
differentiable, and that it is unique. 
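The expansion behind Riemann’s argument can be illustrated in one dimension, where the Dirichlet integral of u is the integral of u'(x)^2 over [0, 1] and "harmonic" simply means linear. The sketch below is an assumption-laden illustration, not Riemann’s computation: the boundary values 1 and 3, the variation s(x) = sin(pi x) and the helper names are chosen here. It checks that the cross term (the integral of v' s') vanishes for the linear function v, and that adding any multiple of s can only increase the integral.

```python
import math


def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3.0


A, B = 1.0, 3.0  # hypothetical boundary values at x = 0 and x = 1


def dv(x):
    """Derivative of the linear (one-dimensional harmonic) function v = A + (B - A) x."""
    return B - A


def ds(x):
    """Derivative of the variation s(x) = sin(pi x), which vanishes at both ends."""
    return math.pi * math.cos(math.pi * x)


def dirichlet_integral(du):
    """One-dimensional Dirichlet integral of a function with derivative du."""
    return simpson(lambda x: du(x) ** 2, 0.0, 1.0)


cross_term = simpson(lambda x: dv(x) * ds(x), 0.0, 1.0)

checks = []
for h in (-0.5, 0.1, 0.5):
    perturbed = dirichlet_integral(lambda x: dv(x) + h * ds(x))
    expanded = (dirichlet_integral(dv) + 2.0 * h * cross_term
                + h * h * dirichlet_integral(ds))
    checks.append((h, perturbed, expanded))
    print(h, perturbed, expanded)
```

In this toy setting the minimum is visibly attained by the linear function; the difficulty discussed in the text is that nothing guarantees a minimiser in general.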

He then showed that there is a unique Green’s function U that also
solves the Dirichlet problem. He considered the function

$$U = U_1 + \frac{1}{r},$$

where r denotes the distance of the point (x, y, z) from the point
$P' = (x', y', z')$ inside S where the function U is infinite, and $U_1$
satisfies Laplace’s equation away from $P'$ and agrees with $-1/r$ on the
boundary of the domain. The function 1/r satisfies Laplace’s equation
and so the above function is a Green’s function that vanishes on S and is
harmonic everywhere except at the point $(x', y', z')$ in T. The
uniqueness of a harmonic function with given boundary conditions had
been proved by Dirichlet himself.


18.6 Exercises 

1. Green’s functions are not easy to compute; find instructive
examples on the web.


Questions

1. A chicken and egg question: Which is mathematically more
fundamental, the existence of a Green’s function for a given domain
or a solution of the Dirichlet problem for that domain? Or are they
equivalent?

2. Are there examples of problems in physics that lead to an
unsolvable Dirichlet problem, or is the problem one for
mathematicians only?


Footnotes 


1 Green wrote everything out in full. 


2 The lectures were published posthumously in an edition by Grube in 1876. 


3 From Dirichlet ([64], 127-128), quoted in Bottazzini, The Higher Calculus, 300. 


4 Dedekind would have learned this material, as Riemann did, from Dirichlet’s lectures. 
19.1 Introduction


The work of Green, Gauss, and Riemann showed very clearly that the 
study of real functions of two and three variables was a rich domain 
that would be essential in the study of physics (gravitation and electro- 
magnetism), and which (in two dimensions) was a powerful tool in the 
emerging subject of complex function theory. For physicists such as 
Thomson, Helmholtz, and Maxwell, nature provided the existence and 
uniqueness theorems upon which the theory rested, but for 
mathematicians, and especially those a step or more away from 
theoretical physics, those theorems looked increasingly insecure. 


19.2 Weierstrass, Prym, and Schwarz 


Weierstrass was not persuaded by the Dirichlet principle, even for 
planar regions. In a paper he read to the Royal Academy of Sciences in 
Berlin in 1870, but which was not published until the second volume of 
his collected works in 1895, he agreed that if the Dirichlet integral 
exists and attains its minimum then the minimising function is 
harmonic and unique. But when he turned to the existence question, he 


offered what he called a simple example to show the inadmissibility of 
Dirichlet’s reasoning. He observed that 


$$J = \int_{-1}^{1} \left[ x \frac{d\varphi}{dx} \right]^2 dx,$$

where $\varphi(-1) = a \neq b = \varphi(1)$, is always positive and can take any non-
zero value however small, but cannot take the value zero unless $d\varphi/dx$
vanishes on the interval $[-1, 1]$, which is ruled out by the boundary
conditions.
Weierstrass’s example was the function 


$$\varphi(x) = \frac{a + b}{2} + \frac{b - a}{2}\,\frac{\arctan(x/\varepsilon)}{\arctan(1/\varepsilon)},$$

where $\varepsilon$ is arbitrarily small and positive. The graph of this function
with $\varepsilon = 0.01$, $a = -1$, $b = 1$, in Fig. 19.1 suggests that this function is
likely to do the trick. 


Fig. 19.1 The graph of arctan(x/0.01)/(arctan(1/0.01)), −1 < x < 1

Here

$$\varphi'(x) = \frac{b - a}{2\arctan(1/\varepsilon)}\,\frac{\varepsilon}{x^2 + \varepsilon^2},$$

so the integral is

$$J = \frac{(b - a)^2}{4\arctan^2(1/\varepsilon)} \int_{-1}^{1} \frac{x^2 \varepsilon^2}{(x^2 + \varepsilon^2)^2}\,dx.$$

The integrand is always less than $(x/\varepsilon)^2$ and $(\varepsilon/x)^2$ and takes its maximum
value of $1/4$ at $x = \pm\varepsilon$.

Weierstrass now argued that the integrand is positive and less than
$\varepsilon^2/(x^2 + \varepsilon^2)$, so the integral is less than

$$\frac{\varepsilon (b - a)^2}{2 \arctan(1/\varepsilon)}.$$

So the integral can be arbitrarily small for a suitable function $\varphi(x)$


which has a continuous first derivative, but can never be zero. So, he 
concluded, 


The Dirichlet principle leads in this case to a false result. 
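Weierstrass’s example lends itself to a quick numerical check. The sketch below assumes the integral is J, the integral from −1 to 1 of (x φ'(x))^2 dx, with Weierstrass’s arctangent function φ and the values a = −1, b = 1 used for the figure; the function names and the choice of Simpson’s rule are assumptions made here for illustration. It compares J with the bound ε(b − a)^2 / (2 arctan(1/ε)) for a few values of ε.

```python
import math


def phi_prime(x, eps, a, b):
    """Derivative of Weierstrass's function phi with parameter eps."""
    return (b - a) / (2.0 * math.atan(1.0 / eps)) * eps / (x * x + eps * eps)


def weierstrass_J(eps, a=-1.0, b=1.0, n=100_000):
    """Composite Simpson approximation of the integral of (x * phi'(x))**2 on [-1, 1]."""
    h = 2.0 / n  # n must be even

    def g(x):
        return (x * phi_prime(x, eps, a, b)) ** 2

    total = g(-1.0) + g(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(-1.0 + i * h)
    return total * h / 3.0


def weierstrass_bound(eps, a=-1.0, b=1.0):
    """The upper bound eps (b - a)^2 / (2 arctan(1/eps))."""
    return eps * (b - a) ** 2 / (2.0 * math.atan(1.0 / eps))


results = [(eps, weierstrass_J(eps), weierstrass_bound(eps))
           for eps in (0.1, 0.01, 0.001)]
for eps, value, bound in results:
    print(eps, value, bound)
```

The computed values stay positive and below the bound while shrinking with ε, so the infimum zero is approached but, as the text explains, never attained.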


It is curious that Weierstrass did not use the simpler straight line 
version, given by 
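$$y = \begin{cases} -1 & \text{if } -1 \le x \le -1/n, \\ nx & \text{if } -1/n < x < 1/n, \\ 1 & \text{if } 1/n \le x \le 1. \end{cases}$$

In this case,

$$J = \int_{-1/n}^{1/n} n^2 x^2\,dx = \frac{2}{3n},$$

which becomes arbitrarily small as $n \to \infty$. It is unlikely he did not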


think of it. More likely, he rejected it because the theory of admissible 
curves that were not smooth was much under discussion at the time 
and he did not want to weaken his critique of the Dirichlet principle 
with extraneous considerations of that kind. 

Weierstrass’s argument destroys the belief that any integral that is 
bounded below attains its bounds. But it leaves open the possibility 
that the Dirichlet integral may do so, and that the Dirichlet principle 
leads to a solution of the Dirichlet problem. However, a former student 
of Riemann’s, Friedrich Prym, showed in his [228] that Riemann’s use 
of the Dirichlet principle can fail, even when Dirichlet’s problem can be 
solved. 

Prym did so by exploiting an idea of Riemann’s hitherto ignored in 
studies of the problem, namely, that a continuous function may oscillate 
wildly. Such functions may depart from their Fourier series 
representations, unlike the ones studied by Dirichlet that had only 
finitely many maxima and minima. 

Prym considered (in Sect. 8) a disc in the plane of radius $R < 1/2$
centred on the origin, and took polar coordinates $\rho$ and $\tau$ centred on
the point (−R, 0), so $\rho$ takes every value from 0 to $2R < 1$ and $\tau$ every
value from $-\pi/2$ to $\pi/2$. On this disc he defined the function u as the real
part of the complex function

$$u + iv = i\sqrt{-\ln(R + x + iy)}$$

that was taken to satisfy $-\ln(R + x + iy) = -\ln\rho - i\tau$. He now showed
that this function is defined and continuous at the origin of the polar
coordinates.

The Dirichlet problem is solved, because the functions u and v are 
everywhere defined and single valued, even on the boundary of the 
disc, and the function u is harmonic because it is the real part of a 
complex function. However, as Prym then showed, Dirichlet’s integral
for u is infinite, because the function u oscillates infinitely often in
any neighbourhood of the point (−R, 0).


Prym’s contribution left open the question of whether the Dirichlet 
problem could be solved. The leading figure here was Hermann 
Amandus Schwarz (Fig. 19.2), and in a series of papers around 1870 he 
was able to solve the problem in a fair degree of generality. 


Fig. 19.2 Hermann Amandus Schwarz (1843-1921) 


In his paper [244], Schwarz first solved the problem when the
domain T is the unit disc. He considered an arbitrary function f on the
unit circle that is finite, continuous, and real-valued everywhere (so it
corresponds to a periodic function defined on $\mathbb{R}$). He then wrote down
the function $u(r, \varphi)$ that solves the Dirichlet problem. It is defined by
the following equations: $u(1, \varphi) = f(\varphi)$ and

$$u(r, \varphi) = \frac{1}{2\pi} \int_0^{2\pi} f(\psi)\,\frac{1 - r^2}{1 - 2r\cos(\psi - \varphi) + r^2}\,d\psi, \qquad r < 1.$$

A careful analysis shows that it is finite and continuous in every closed
disc of radius less than 1, and converges to a function that is finite and
continuous on the whole closed unit disc. Once this is established,
straightforward differentiation shows that the function $u(r, \varphi)$ satisfies
the differential equation $\Delta u = 0$ in the interior of the disc.
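The integral formula for the unit disc (the Poisson integral) is easy to test numerically. The sketch below is an illustration under assumptions made here: the kernel (1 − r^2) / (1 − 2r cos(ψ − φ) + r^2) with the factor 1/(2π), the function name `poisson_integral`, and the sample boundary data cos ψ and cos 2ψ, whose harmonic extensions are r cos φ and r^2 cos 2φ. The integral is approximated by an equally spaced sum, which is very accurate for smooth periodic integrands.

```python
import math


def poisson_integral(f, r, phi, n=2000):
    """Approximate (1/2pi) * integral over [0, 2pi] of f(psi) * kernel(r, psi - phi)."""
    total = 0.0
    for k in range(n):
        psi = 2.0 * math.pi * k / n
        kernel = (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(psi - phi) + r * r)
        total += f(psi) * kernel
    # (1 / 2pi) times the step 2pi / n leaves a plain average over the nodes.
    return total / n


r, phi = 0.5, 1.0  # hypothetical interior point in polar coordinates
u_first = poisson_integral(math.cos, r, phi)
u_second = poisson_integral(lambda psi: math.cos(2.0 * psi), r, phi)

print(u_first, r * math.cos(phi))
print(u_second, r ** 2 * math.cos(2.0 * phi))
```

Each printed pair should agree closely, consistent with the claim that the formula reproduces the harmonic function with the given boundary values.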


It follows that the Dirichlet problem is solved for any domain 
conformally equivalent to the unit disc, but Riemann’s claim that this 
was true of any simply connected domain was far from understood or 
accepted (or, it should be said, precise). Indeed, one of Schwarz’s 
earliest papers [242] had been to establish the equivalence precisely 
for a disc and a square, thus giving a new twist to the famous problem 
of squaring the circle.³ (This is an early example of what was to become
the Schwarz-Christoffel theorem at work.) So Schwarz next presented a 
method of extending the solution from domains where the problem was 
solved to domains formed by overlapping such domains in the plane. 
This gave him a large class of domains for which the Dirichlet 
problem had a solution. 


19.2.1 Schwarz’s Alternating Method 


Schwarz in his [243] considered two domains, $T_1$ and $T_2$ that overlap
in such a way that $T^*$, the region common to both of them, is
two-dimensional and the boundaries of the regions cross with distinct
tangents.
He supposed that the Dirichlet problem on $T_1$ can be solved for any
prescribed (finite, continuous, real-valued) function on the
boundary of $T_1$, and likewise that it can be solved on $T_2$ for any
function on the boundary of $T_2$. He then showed that there is a
solution to the Dirichlet problem for the domain $T_1 \cup T_2$ with any
prescribed function u on the boundary $L_0 \cup L_3$. This function will be
bounded below, and so without loss of generality Schwarz assumed that


x=(, 

He divided the boundary of $T_1$ into two parts, one outside $T_2$ that
he called $L_0$, and one inside $T_2$ that he called $L_2$. Similarly, he called
$L_3$ the boundary of $T_2$ that lies outside $T_1$ and $L_1$ the part of the
boundary of $T_2$ that lies inside $T_1$. The region $T_1 \cup T_2$ he called T.


(For a picture, see Fig. 31.1.) 

Schwarz’s idea was that an air pump (his term) could be imagined
that pumped air from the region $T^*$ alternately into the regions
$T_1 \setminus T^*$ and $T_2 \setminus T^*$, through the membranes $L_1$ and $L_2$. In
mathematical terms, he supposed that the Dirichlet problem is first
solved on $T_1$ with the boundary values u(x, y) on $L_0$ and a constant, k,
on $L_2$ (k will later be chosen to be the minimum value of the function u
on the boundary $L_0 \cup L_3$). The solution is a harmonic function $u_1$, say,
inside $T_1$. He then chose the solution $u_2$ of the Dirichlet problem for the
region $T_2$ for a function which took the same values as the function u
on $L_3$ and the values of $u_1$ on $L_1$. He then treated the region $T_1$ as he
had just treated $T_2$ to obtain a harmonic function $u_3$, then turned to
$T_2$ again to obtain a harmonic function $u_4$, and so on.


Why does this help? The maximum and minimum values of a
harmonic function are taken on the boundary of its domain, and the
maximum and minimum values of the function that is u on $L_0 \cup L_3$ are,
say, g and k, respectively, and define $G = g - k$, and we note that on
$L_2$ the maximum value of the function $u_2 - u_1$ is less than g − k.

Next, the function $u_3 - u_1$ solves the Dirichlet problem where the
boundary function is 0 on $L_0$ and $u_2 - u_1$ on $L_2$. So $u_3 - u_1$ is never
negative inside $T_1$, its maximum value is less than G, and its maximum
value on $L_1$ is less than $G q_1$, where $q_1 < 1$. A simple scaling argument
shows that this bound depends linearly on the maximum value of the
boundary function on $L_2$.

Similarly, the maximum absolute value of $u_4 - u_2$ on $L_1$ is less than
$G q_1$ and on $L_2$ is less than $G q_1 q_2$, where $q_2 < 1$.


Continuing in this way, Schwarz obtained two sequences of
functions, $\{u_{2i-1}\}$ defined on $T_1$ and $\{u_{2i}\}$ defined on $T_2$ with the
properties that along $L_1$, $u_{2i} = u_{2i-1}$ and along $L_2$, $u_{2i+1} = u_{2i}$. He then
defined two new functions

$$u' = u_1 + (u_3 - u_1) + (u_5 - u_3) + \dots + (u_{2i+1} - u_{2i-1}) + \dots$$

$$u'' = u_2 + (u_4 - u_2) + (u_6 - u_4) + \dots + (u_{2i+2} - u_{2i}) + \dots$$

These series converge unconditionally because successive terms
diminish faster than a geometric progression with ratio $q_1 q_2$. (The
geometric progression arises from the linearity observation made
above.) The function $u'$ is harmonic inside $T_1$ and the function $u''$ is
harmonic inside $T_2$ and they agree on the entire boundary of $T^*$,
which is $L_1$ and $L_2$. Therefore, the functions $u'$ and $u''$ agree on $T^*$,
and so they define a harmonic function on the whole of $T_1 \cup T_2$ that
has the prescribed boundary values of the function u on the boundary
$L_0 \cup L_3$. So the Dirichlet problem is solved for the union of the two
regions, which is what Schwarz set out to do.
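The geometric convergence of the alternating method can be seen in a one-dimensional toy version, which is an assumption made here for illustration and not Schwarz’s planar setting. The two overlapping "domains" are the intervals [0, b] and [a, 1] with 0 < a < b < 1, harmonic functions are linear, and the outer boundary values are given at 0 and 1. Each sweep solves on the first interval, reads off the value at a, solves on the second interval, and reads off the value at b. The function name and parameters are hypothetical.

```python
def schwarz_alternating_1d(left_value, right_value, a, b, sweeps):
    """Alternate between [0, b] and [a, 1]; return the values found at a and b."""
    # Start with the minimum of the outer boundary values on the inner end point b.
    value_at_b = min(left_value, right_value)
    history = []
    for _ in range(sweeps):
        # Solve on [0, b]: the linear function through (0, left_value) and (b, value_at_b).
        value_at_a = left_value + (value_at_b - left_value) * a / b
        # Solve on [a, 1]: the linear function through (a, value_at_a) and (1, right_value).
        value_at_b = value_at_a + (right_value - value_at_a) * (b - a) / (1.0 - a)
        history.append((value_at_a, value_at_b))
    return history


# Outer boundary values 1 and 3, overlap between a = 0.4 and b = 0.6.
history = schwarz_alternating_1d(1.0, 3.0, 0.4, 0.6, sweeps=60)

# The exact solution is u(x) = 1 + 2x, so the limits are u(0.4) and u(0.6).
exact_at_a, exact_at_b = 1.0 + 2.0 * 0.4, 1.0 + 2.0 * 0.6
error_ratio = (exact_at_b - history[1][1]) / (exact_at_b - history[0][1])

print(history[-1], (exact_at_a, exact_at_b), error_ratio)
```

In this toy case the error shrinks by the factor (a/b)(1 − b)/(1 − a) on every sweep, which plays the role of the product of the two contraction ratios in the argument above.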

The alternating method provides a solution to the Dirichlet 
problem for a large class of regions, including all plane domains with 
polygonal boundaries with finitely many sides. In general, the boundary 
can be made up of piecewise analytic arcs crossing transversally, but it 
is not clear what can happen in the limit, so the case of arbitrary 
boundaries, even rectifiable ones, was left unresolved. That said, 
Constantin Carathéodory praised Schwarz in Schwarz’s Festschrift 
volume ([30], 20) for separating out the interior part and the boundary 
part of the Riemann mapping theorem, and indeed Poincaré’s method 
of sweeping out (1890b) similarly made certain simplifying 
assumptions about the boundary but left the extent of the method 
unresolved. 

On the other hand, as Archibald ([1], 83) points out, Schwarz’s 
paper required familiarity with Weierstrassian methods to understand, 
and such knowledge was only available to those with access to copies of 
Weierstrass’s lectures and notes taken by the few students capable of 
doing so. Archibald, quoting ([65], 154), records the astute opinion of 
Gösta Mittag-Leffler, who was a significant figure in spreading the
Weierstrassian model of analysis: 


The Germans themselves are not in general sufficiently familiar 
with Monsieur Weierstrass’s ideas to be able to grasp without 
difficulty an exposition made strictly on the classical model that 
the great geometer has given. Take, for example, Monsieur 

Fuchs ...he regards [Weierstrass’s] methods as thoroughly 
superior to the method of Riemann. And yet he always writes in 
the manner of Riemann. All this evil derives from the fact that M. 
Weierstrass has not published his courses. It is true that the 
Weierstrassian method is taught in several German universities, 
but everyone is not yet a pupil of Weierstrass or a pupil of one of 
his pupils. 


19.3 Harnack 


The study of the Dirichlet problem for general two-dimensional 
domains was much advanced by Axel Harnack in his book [138].⁴ He
began by reviewing the theories of Schwarz and Neumann, and 
observed that these authors had not fully studied the nature of the 
boundary before admitting, however, that he had not been able to 
extend their methods. Therefore, he had adopted a different approach 
using Green’s functions. He established existence theorems for 
functions with prescribed singularities, derived the general theorems in 
Riemann’s paper on Abelian functions, and showed how his ideas led to 
a proof of the Riemann mapping theorem. 

Harnack’s book was well received, and subsequent mathematicians 
often made use of what became known as Harnack’s theorem ([138], 
67). Harnack considered a sequence of harmonic functions $u_n$ that are
defined on a surface F and restrict to continuous functions on the
boundary of F. He furthermore supposed that for every arbitrarily small
positive quantity and for every point s of the boundary there is a finite
domain partly bounded by a piece of the boundary of F containing s and
that contains interior points of F and is such that the values each
function $u_n$ takes on this domain (including its boundary) vary by less
than that quantity. As he remarked, it is a necessary condition that the
boundary functions be continuous. Harnack first established the lemma
that if the sum of the boundary functions converges uniformly, then the
sum $\sum u_n$ converges at every interior point of the surface F to a
harmonic function. From this he deduced Harnack’s theorem:


Theorem 19.3 If a sequence of harmonic functions $u_n$ all have the
same sign (say, positive) and the sum $\sum u_n$ converges at an interior
point of the surface F, then it converges at every interior point of the
surface F to a harmonic function.


Alternatively, if a sequence of harmonic functions $u_n$ tend from below
to the values of a function u, then u is a harmonic function.

Informally, the sequence of harmonic functions either converges to 
a harmonic function or it fails to converge at all. 

Harnack’s book contains a number of advances in what later 
became point-set topology. He defined a domain to be connected if any 
two points in it can be joined by a finite polygonal arc that can be 
covered by overlapping discs all lying in the interior of the domain. 
Later authors refined this idea and separated the idea of connectedness 
(here, path-connectedness) from that of a domain (a region for which 
every point has a disc-like neighbourhood lying entirely in the region). 
Harnack also defined boundary points to be those points every 
neighbourhood of which contains some points belonging to the domain 
and some that do not. He then claimed that a simply connected domain
has a continuous boundary, although the boundary may have corners
and cusps, be nowhere differentiable and, implicitly, need not even
be rectifiable.⁵ A boundary might also have an arbitrary number of
incisions, lines drawn inwards from boundary points, which would be 
traversed twice by any circuit of the boundary (Fig. 19.3). 


Fig. 19.3 A circuit of the disc with the incision AB traverses AB twice, once going up and once
coming down


It was with this idea of a domain and its boundary in place that 
Harnack then proved that there is a Green’s function for every bounded 


region with an arbitrary boundary. He argued in three stages. First, he 
accepted Neumann’s approach establishing the existence of a unique 
harmonic function that agrees with a given continuous function on the 
boundary for polygonal regions with no re-entrant angles. Then he 
showed that if the given function on the boundary is always finite but 
has isolated jump discontinuities then there is still a harmonic function 
agreeing with the given one at points on the boundary where the given 
function is continuous. Finally, he used an approximative argument, 
which he attributed ultimately to Schwarz, to deal with the general 
simply connected domain. He considered that the domain can be 
steadily approximated by polygonal regions. He used his theorem from 
p. 67 to establish that the sequence of harmonic functions on these 
domains converged to a harmonic function on the given domain. 
Arbitrary bounded domains were then patched together out of simply 
connected pieces. To prove the Riemann mapping theorem Harnack 
used his Green’s function approach to establish the existence of a
suitable harmonic function and consequently of a complex function 
mapping the given bounded domain onto a circle. He also showed in 
this way that non-simply connected domains can be mapped onto a 
domain bounded by several circles (a problem Riemann had also 
investigated in work that was still unpublished). 

In the event, it was to turn out that the Dirichlet problem can be 
solved for a very large class of boundaries of a two-dimensional, disc- 
shaped region, but that the problem in three dimensions can only be 
solved for a restricted class of boundaries (without spikes, for 
example). For that reason, and because of the strong connection to 
complex function theory, I have kept the story that follows to two 
dimensions, and even then the full history of potential theory in the 
period is too rich to describe here. For a look at some of the major 
issues that were raised, and how some of them were solved, see 
Appendix D. 


19.4 Exercises 


1. Find some domains homeomorphic to a disc whose boundaries are
not homeomorphic to circles.


Questions

1. Try to follow through the first few stages of Schwarz’s alternating
method when the initial values on $L_0$ are 1 and the initial values on
$L_3$ are 3, noting the values assigned at each stage to $L_1$ and $L_2$.

2. Does it strike you as likely that the method will lead to good
approximations to the sought-for harmonic function?


Footnotes 


1 It helps to recall that $\frac{d}{dx} \arctan(x/\varepsilon) = \varepsilon/(\varepsilon^2 + x^2)$.


2 Another way forward was an iterative process described by Carl Neumann, whose work will 
not be discussed here. 


3 For his proof that a square can be mapped analytically onto a circle, see a translation of his 
paper below, in Sect. 31.3. 


4 Harnack restricted his attention to this case because of the availability of conformal 
mappings. 


5 Riemann in his [234] had defined a domain as simply connected if any curve in it joining two 
boundary points divides the domain into two pieces. 
20.1 Introduction


The wave equation is one of the most useful in physics. Here we look at 
the dramatic story of the trans-Atlantic cable and the later introduction 
of the telegraphist’s equation. 

The best resource for the history and context of the trans-Atlantic 
cable up to the present day is surely the History of the Atlantic Cable & 
Undersea Communication at atlantic-cable.com. See, among other 
things, Bern Dibner’s The Atlantic Cable (1959). When I gave this course 
in 2017 it seemed to me that everything was here except an explanation 
of the mathematics. Since then I am pleased to say that Liam Morris, a 
student on the course that year, has posted an account of the 
mathematics. 


20.2 The Trans-Atlantic Cable 


The wave equation was at the mathematical heart of a dramatic 
nineteenth-century story: the struggle to connect Britain and America 
by a trans-Atlantic cable. 

The first working electric telegraphs were produced in the 1830s, 
and soon a network of cables crossed Europe and spread throughout 


the eastern seaboard of the United States. Information could now be 
sent reliably, and very much faster than by a man on a string of horses, 
and typically it came transcribed letter by letter into a stream of short 
and long pulses—such as the dots and dashes of Morse Code. 

However, it was not so easy to connect Britain to Continental 
Europe. It was discovered that placing a cable under water increased its 
capacitance (the ability of a body to store electric charge). The 
increased capacitance caused the signal to spread out, so that the gap 
between one item and the next had to be increased, causing the time for 
a message to be transmitted to increase. The English Channel is not 
very wide; at its narrowest it is only some 22 miles. It was much riskier 
to run a cable across the Atlantic, but it would surely be extremely 
valuable and therefore attempts had to be made. 

To understand the problem, it is necessary to consider what is 
called the telegraphist’s equation. This equation describes the current u 
at any point x in a straight wire at any time t during the transmission of 
electric signals down the wire. It is 


\[ KL\,\frac{\partial^2 u}{\partial t^2} + (KR + LS)\,\frac{\partial u}{\partial t} + RSu = \frac{\partial^2 u}{\partial x^2}, \tag{20.1} \]
where the constants that appear in the equation involve the capacitance 
K, the self-inductance L, the resistance R, and the leakage S of the wire. 
(Self-inductance is the induction of a voltage in a current-carrying wire 
as the current changes.) 

It was written down for the first time by the German physicist 
Gustav Kirchhoff in 1857, and profoundly studied by the brilliant but 
eccentric English physicist Oliver Heaviside in 1876, but prior to them 
William Thomson had cleverly exploited a simpler equation to design a 
cable that would work across the Atlantic.¹ 

The equation Thomson derived for the flow of electricity down a 
wire was the heat equation. Although he did this on physical grounds, it 
was nonetheless the case that this equation and Fourier’s study of it 
were at the core of his thinking. This was not Eq. (20.1) later derived by 
Kirchhoff, but it can be obtained from it when the inductance L is 
negligible by comparison with the resistance R, so the constant KL may 
be taken to be zero, and the equation becomes the one-dimensional 
heat equation, 


Thomson, however, was not aware of self-inductance so his methods 
take no account of it. 

It is worth noting that mathematicians often seek to understand a 
complicated equation such as this one by simplifying it. For example, in 
the case at hand, if one assumes there is no resistance ($R = 0$) and no 
leakage ($S = 0$) then the equation reduces to the wave equation: 

\[ KL\,\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}. \]


Thomson’s simpler equation implies that an instantaneous pulse sent 
down a wire of length $x$ lasts for a time $T$ proportional to $x^2$ seconds 
(see below), and so two separate pulses must be transmitted $T$ seconds 
apart in order to be received as distinct signals at the far end. But in 
1855 Thomson’s advice was ignored. 

Attempts to lay the cable were dogged by failure. The first cable, laid 
in 1857, snapped after 338 miles. A second cable, laid in 1858, 
succeeded, and a 99-word message was sent from Queen Victoria to 
President Buchanan to mark the event, but this only revealed a greater 
failure: the message took 16½ hours to transmit. Misguided attempts to 
improve performance made things worse, and after a month the cable 
had to be abandoned. As the mathematician Thomas Körner put it, 
“2500 tons of cable and £350 000 of capital lay useless on the ocean 
floor”.² 


Fig. 20.1 William Thomson, later Lord Kelvin (1824-1907), Memorials of the Old College of 
Glasgow, Glasgow, 1871 


Thomson (Fig. 20.1) had opposed the original design, and was now 
placed in a position to insist on his insights being implemented. Now, 
“Half a million pounds was being staked on the correctness of the 
solution to a partial differential equation”.³ 

As before, the first attempt at laying the cable, which was much 
heavier than the earlier one, failed when the cable broke, and another 
one had to be laid. But they were able to recover the ends of the broken 
cable and reconnect it, and on 8 September 1866 America and Europe 
were joined by two cables that, moreover, worked as planned. Signals 
could be transmitted at roughly eight words a minute, a decisive 
improvement on the earlier, and by then defunct, cable. Thomson was 
knighted and further rewarded with a considerable amount of money, 
some of which he used to buy an ocean-going yacht—he was a keen 
sailor. 

Thomson’s telegraphist’s equation is the heat equation. There is 
only space here to describe his solution, not to prove it.⁴ 

The problem is that of the distribution of heat in a semi-infinite 
one-dimensional rod $x \ge 0$, which may not make much sense as a problem 
about heat but makes good sense if we think of the rod as a wire. 


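Define the functions 

\[ f_-(w) = \exp\left(-\frac{(x-w)^2}{4Kt}\right), \qquad f_+(w) = \exp\left(-\frac{(x+w)^2}{4Kt}\right). \]

Then the function 

\[ \theta(x,t) = \frac{\theta_0}{2\sqrt{\pi K t}} \int_0^\infty \bigl(f_-(w) - f_+(w)\bigr)\,dw \]

is a solution of the heat equation 

\[ \frac{\partial \theta}{\partial t} = K\,\frac{\partial^2 \theta}{\partial x^2} \]

for which $\theta(x,t) \to \theta_0$ as $t \to 0$ for all $x > 0$ and $\theta(0,t) = 0$ for all 
$t > 0$. 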
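The integral of the two Gaussian terms can be evaluated in closed form, giving $\theta_0\,\mathrm{erf}\bigl(x/(2\sqrt{Kt})\bigr)$; this closed form is a standard fact supplied here, not something stated in the text. The following Python sketch compares it with a direct numerical evaluation of the integral. The function name, the truncation of the infinite range, and the sample values of $x$, $t$ and $K$ are assumptions made only for illustration.

```python
import math

def theta_by_images(x, t, K=1.0, theta0=1.0, steps=20_000):
    """Trapezoidal estimate of theta0 / (2 sqrt(pi K t)) * integral_0^inf (f_- - f_+) dw."""
    scale = math.sqrt(4.0 * K * t)
    upper = x + 10.0 * scale          # beyond this the integrand is negligibly small
    h = upper / steps

    def integrand(w):
        return (math.exp(-(x - w) ** 2 / (4.0 * K * t))
                - math.exp(-(x + w) ** 2 / (4.0 * K * t)))

    total = 0.5 * (integrand(0.0) + integrand(upper))
    for i in range(1, steps):
        total += integrand(i * h)
    return theta0 / (2.0 * math.sqrt(math.pi * K * t)) * total * h

x, t, K = 1.0, 0.5, 1.0               # illustrative values only
numerical = theta_by_images(x, t, K)
closed_form = math.erf(x / (2.0 * math.sqrt(K * t)))
print(numerical, closed_form)
```

At $x = 0$ the two exponentials cancel for every $w$, which is how the boundary condition $\theta(0,t) = 0$ is enforced.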
If instead it is required that $\theta(x,t) \to 0$ as $t \to 0$ for all $x > 0$ and 
$\theta(0,t) = f(t)$ for all $t > 0$, then the solution becomes 

\[ \theta(x,t) = \int_0^t f(s)\,P(x,t-s)\,ds, \]

with the kernel $P$ given below. So if the question becomes what is the 
result of briskly heating one end, by a function $f(t)$ that is zero outside 
of a small interval (say $(0,\epsilon)$), then the above expression is the answer. 

In these circumstances, the solution is well approximated by 

\[ \theta(x,t) = \int_0^{\epsilon} f(s)\,P(x,t)\,ds, \]

where 

\[ P(x,t) = \frac{x}{2\pi^{1/2}(Kt)^{3/2}} \exp\left(-\frac{x^2}{Kt}\right). \]


So if the input is an initial, short blip (a pulse concentrated in a very 
short interval), the output at a point $x$ at time $t$ is given by the graph in 
Fig. 20.2. Slices for constant $t$ show the shape of the pulse at that time; 
slices for constant $x$ show what happens at that point as time goes by. 


Fig. 20.2 The graph of P(x,t), O<x<5, 0<t<0.08 


If $t = t_0$ is a constant then $P(x,t_0)$ is a function of $x$, and 

\[ \frac{\partial P(x,t_0)}{\partial x} = \frac{1}{2\pi^{1/2}(Kt_0)^{3/2}} \exp\left(-\frac{x^2}{Kt_0}\right)\left(1 - \frac{2x^2}{Kt_0}\right). \]

The maximum value occurs at 

\[ x = \sqrt{\frac{Kt_0}{2}} \quad\text{and is}\quad \frac{\sqrt{2}\,e^{-1/2}}{4\pi^{1/2}Kt_0}. \]

So when $t_0$ is small the maximum is quite large and occurs for a small 
value of $x$, but when $t_0$ is large the maximum is quite small and occurs 
for a large value of $x$. This confirms what the graph suggests, that the 
signal travels down the wire getting weaker as it goes. 
If $x = x_0$ is a constant then $P(x_0,t)$ is a function of $t$, and 

\[ \frac{\partial P(x_0,t)}{\partial t} = \frac{x_0}{2\pi^{1/2}K^{3/2}t^{5/2}} \exp\left(-\frac{x_0^2}{Kt}\right)\left(\frac{x_0^2}{Kt} - \frac{3}{2}\right). \]

The maximum value occurs at 

\[ t = \frac{2x_0^2}{3K} \quad\text{and is}\quad \frac{3^{3/2}e^{-3/2}}{2^{5/2}\pi^{1/2}x_0^2}. \]

So when $x_0$ is small the maximum is reached very quickly and is quite 
large, but when $x_0$ is large the maximum is quite small and occurs for a 
large value of $t$. This again confirms what the graph suggests, that the 
signal travels down the wire getting weaker as it goes. 
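The location and size of both maxima can be checked numerically. The Python sketch below assumes the pulse shape $P(x,t) = \dfrac{x}{2\pi^{1/2}(Kt)^{3/2}}\exp\bigl(-x^2/(Kt)\bigr)$; this form, the crude grid search, and the sample values of $K$, $t_0$ and $x_0$ are assumptions made for illustration, not part of Thomson’s or Körner’s argument.

```python
import math

def pulse(x, t, K=1.0):
    """Assumed pulse shape P(x, t) = x / (2 sqrt(pi) (K t)^(3/2)) * exp(-x^2 / (K t))."""
    return x / (2.0 * math.sqrt(math.pi) * (K * t) ** 1.5) * math.exp(-x * x / (K * t))

def grid_argmax(func, lo, hi, steps=100_000):
    """Return (argument, value) of the largest sampled value of func on [lo, hi]."""
    best_arg, best_val = lo, func(lo)
    for i in range(1, steps + 1):
        arg = lo + (hi - lo) * i / steps
        val = func(arg)
        if val > best_val:
            best_arg, best_val = arg, val
    return best_arg, best_val

K, t0, x0 = 1.0, 0.04, 2.0            # illustrative values only

# Fixed time t0: search over x and compare with the closed-form maximum.
x_star, p_star = grid_argmax(lambda x: pulse(x, t0, K), 0.0, 5.0)
x_pred = math.sqrt(K * t0 / 2.0)
p_pred = math.sqrt(2.0) * math.exp(-0.5) / (4.0 * math.sqrt(math.pi) * K * t0)

# Fixed position x0: search over t and compare with the closed-form maximum.
t_star, q_star = grid_argmax(lambda t: pulse(x0, t, K), 0.01, 20.0)
t_pred = 2.0 * x0 ** 2 / (3.0 * K)
q_pred = 3.0 ** 1.5 * math.exp(-1.5) / (2.0 ** 2.5 * math.sqrt(math.pi) * x0 ** 2)

print(x_star, x_pred, p_star, p_pred)
print(t_star, t_pred, q_star, q_pred)
```

Doubling `x0` in this sketch multiplies the time of the peak by four and divides its height by four, which is the quadratic dependence on the length of the wire that made the Atlantic cable so much harder than the Channel cable.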

We see that the output is a pulse that rises very quickly from zero to 
a maximum and then steadily and slowly declines to zero. As a result, 
the solution of the heat equation is immediately non-zero everywhere, 
rises to a maximum that is inversely proportional to $x^2$ for a time 
proportional to $x^2$, and then declines. 


We can also see that when $x_0$ is large, the pulse broadens as it 
travels by an amount proportional to the square of the distance it has 
travelled. For in $\partial P(x_0,t)/\partial t$ the first term in $t$ is $t^{-5/2}$ and 
so very nearly zero, and in particular almost a constant, and the 
exponential term is likewise very nearly 1 and therefore also essentially 
a constant. Therefore, the variation in $P(x_0,t)$ around its maximum is 
determined by the final term in $t$, which is quadratic in $x_0$. 


Moreover, all these values depend on the value of K, so that is where 
the physics comes in: finding the materials that give the best value for 
the shape of the pulse. In particular, K should be as small as possible to 
keep the pulse sharp. 


20.3 Poincaré’s Solution 


The full telegraphist’s equation was solved for the first time by 
Poincaré in 1893.⁵ In his short paper [217], he took the telegraphist’s 
equation in the form 

\[ A\,\frac{\partial^2 V}{\partial t^2} + 2B\,\frac{\partial V}{\partial t} = C\,\frac{\partial^2 V}{\partial x^2}. \]

He then supposed that the physical units were so chosen that this 
equation becomes 

\[ \frac{\partial^2 V}{\partial t^2} + 2\,\frac{\partial V}{\partial t} = \frac{\partial^2 V}{\partial x^2}, \]

and the velocity of the signal is that of light (and is 1 in these units). He 
then set $V = Ue^{-t}$ and reduced the equation (as Fourier had done in 
his study of heat) to 

\[ \frac{\partial^2 U}{\partial t^2} = \frac{\partial^2 U}{\partial x^2} + U. \]

In so doing, he assumed that $B^2 - 4AC$ is non-zero. 


He now looked for a solution corresponding to the initial conditions 
$U = f(x)$ and $\partial U/\partial t = f_1(x)$ at $t = 0$, where $f$ and $f_1$ vanish outside 
the interval $a < x < b$ and are polynomials in between. He obtained the 
solutions by the method of Fourier transforms, which lies outside this 
course, and was able to show that the solutions are a combination of two 
basic types, one in which the initial conditions are $f = 0$ and the 
other in which $f_1 = 0$. 
In the first case, the solution is a certain Bessel function for 
$a - t < x < b + t$ and zero otherwise.⁶ This is an interval of length $b - a + 2t$. 

In the second case, the solution is a more complicated expression 
involving the same Bessel function that is zero outside the interval 
$(a - t, b + t)$, which is also an interval of length $b - a + 2t$. 


This means that what goes in as a pulse of width $b - a$ comes out as 
an interval of increasing width. However, Poincaré was able to show 
that if $b - a$ is very small then the solutions are 

\[ U(x) = \tfrac{1}{2} f(x - t), \quad a + t < x < b + t, \]

\[ U(x) = \tfrac{1}{2} f(x + t), \quad a - t < x < b - t, \]

and $U(x) = 0$ otherwise. More precisely, the other terms in the 
solution depend on $b - a$ and are negligible if $b - a$ is very small. The 
same is true of the solution to the original equation: 
$V(x,t) = U(x,t)e^{-t}$. 


However, if $b - a$ has a finite size, then the solutions will take the 
form of a pulse with a head and then a tail of length proportional to t 
and therefore to x, the length of the wire. As he put it, if a pulse of some 
simple kind is transmitted between times ¢t = 0 and t = 0 


one sees first of all that the head of the perturbation will travel 
with a certain speed, in such a way that in front of this head the 
perturbation is zero, contrary to what happens in Fourier’s 
theory of heat and in agreement with the laws of propagation of 
light or of plane sound waves deduced from the equation of the 
vibrating string. But there is an important difference with this 
latter case, because the perturbation, as it propagates, leaves 
behind a non-zero residue .... If $b - a$ is small ... the residue is 
negligible in front of the principal perturbation, but this is not 
the case if the perturbation lasts for a long time and if $b - a$ is 
finite. The residue can then disturb the observations. 


Therefore, when an attempt is made to transmit a periodic wave down 
the wire, the velocity and wavelength depend on the frequency, the 
waves undergo dispersion, and the head of the disturbance moves with 
a finite speed. This is also the case with the transmission of light but not 


of heat, and the head, once it has passed, leaves behind a disturbance 
which never vanishes, unlike what happens with the wave equation. 
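The contrast between a sharp head and a lingering residue can be seen in a small numerical experiment. The Python sketch below applies a leapfrog finite-difference scheme to the reduced equation $U_{tt} = U_{xx} + U$ and, for comparison, to the plain wave equation $U_{tt} = U_{xx}$, starting from a box-shaped pulse at rest. The scheme, the grid sizes, and the function name `evolve` are choices made for illustration; this is not Poincaré’s method, which used Fourier transforms.

```python
import math

def evolve(source, steps=200, h=0.05, cells=801, half_width=10):
    """Leapfrog scheme for U_tt = U_xx + c*U with dt = dx = h and zero initial velocity.

    source=True uses c = 1 (the reduced telegraphist's equation);
    source=False uses c = 0 (the plain wave equation, for comparison).
    Returns the list of values of U after `steps` time steps.
    """
    centre = cells // 2
    c = 1.0 if source else 0.0
    prev = [0.0] * cells
    for j in range(centre - half_width, centre + half_width + 1):
        prev[j] = 1.0                      # box-shaped initial pulse f
    curr = [0.0] * cells
    for j in range(1, cells - 1):          # first step from a Taylor expansion in t
        curr[j] = 0.5 * (prev[j + 1] + prev[j - 1]) + 0.5 * c * h * h * prev[j]
    for _ in range(steps - 1):
        nxt = [0.0] * cells
        for j in range(1, cells - 1):
            nxt[j] = curr[j + 1] + curr[j - 1] - prev[j] + c * h * h * curr[j]
        prev, curr = curr, nxt
    return curr

steps, h, cells, half_width = 200, 0.05, 801, 10
centre = cells // 2
head = centre + half_width + steps         # furthest cell the disturbance can reach

U_tel = evolve(True, steps, h, cells, half_width)
U_wave = evolve(False, steps, h, cells, half_width)
V_tel = [value * math.exp(-steps * h) for value in U_tel]   # V = U e^{-t}

print("ahead of the head:", max(abs(v) for v in V_tel[head + 1:]))
print("at the head (U):", U_tel[head])
print("residue at the centre, telegraphist:", V_tel[centre])
print("residue at the centre, wave equation:", U_wave[centre])
```

In this sketch nothing arrives ahead of the head, the head itself carries half the initial amplitude in $U$, and the centre of the wire is left with a non-zero residue in the telegraphist’s case but returns exactly to rest for the wave equation.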

Poincaré was apparently unaware of a remarkable discovery that 
Heaviside had made in 1887, when he showed that the values of the 
physical constants can be so adjusted that the rate of dispersion is zero. 
This can be done both mathematically and physically; it merely requires 
that the leakage be non-zero. Far from being an inconvenience, this 
condition is necessary for the production of distortionless telephony. 
The signal becomes fainter over distances, but this can be corrected by 
fitting amplifiers. Long-distance telegraphy had dealt with distortion by 
accepting a low transmission rate, so as to separate the pulses. 
Telephony required much higher frequencies; with some leakage and a 
deliberately high self-inductance it became distortionless. 
Long-distance communication was reborn, although the money for the first 
successful patents went to the American electrical engineer Michael 
Pupin in 1901, and not to Heaviside.⁷ 
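Heaviside’s adjustment can be made concrete with a short calculation. Substituting $u = e^{-at}w$ into Eq. (20.1) and choosing $a = (KR + LS)/(2KL)$ removes the first-derivative term and leaves $KL\,w_{tt} + c\,w = w_{xx}$ with $c = KLa^2 - (KR + LS)a + RS = -(KR - LS)^2/(4KL)$. The residue $c$ vanishes exactly when $KR = LS$, which is the usual statement of Heaviside’s distortionless condition; that explicit condition is supplied here and is not quoted from the text. The Python sketch below, with a hypothetical function name and made-up sample values, evaluates $a$ and $c$.

```python
def damping_and_residue(K, L, R, S):
    """For K L u_tt + (K R + L S) u_t + R S u = u_xx, substitute u = exp(-a t) w.

    Returns (a, c) where a removes the w_t term and c is the coefficient of w
    left in  K L w_tt + c w = w_xx.
    """
    a = (K * R + L * S) / (2.0 * K * L)
    c = K * L * a * a - (K * R + L * S) * a + R * S
    return a, c

# Illustrative values only: a leak-free cable, then one with K R = L S.
print(damping_and_residue(K=2.0, L=3.0, R=6.0, S=0.0))
print(damping_and_residue(K=2.0, L=3.0, R=6.0, S=4.0))
```

When the residue is zero, $w$ obeys the pure wave equation, so every pulse keeps its shape and is merely attenuated by the factor $e^{-at}$.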

It is intriguing to see that Poincaré also failed to mention 
Heaviside’s ingenious discoveries in his lecture course in 1894, Cours 
sur les oscillations électriques. There he surveyed a considerable amount 
of mostly French experimental work, with a view to deciding between 
the old theory of electro-magnetism (due to Kirchhoff) and the modern 
theories of Maxwell and Hertz. The reason may have been a misplaced 
interest in the general case. Poincaré’s analysis of the telegraphist’s 
equation depended on the condition $B^2 - 4AC \ne 0$, or $KR \ne LS$, but 
equality in these cases is exactly the condition upon which Heaviside’s 
insight depends. So the experimental work was given a theoretical twist 
and technological implications were not mentioned. 


20.3.1 Conclusion 


The principal three partial differential equations that we have 
considered, the heat equation, Laplace’s equation, and the wave 
equation, became known as the differential equations of mathematical 
physics. It is a striking fact that between them they describe so many of 
the advances in applied mathematics made in the nineteenth century 
and into the twentieth, and it is fortunate that in many cases they can 
be solved when appropriate boundary or initial conditions are 
specified, for, as the telegraphist’s equation indicates, rigorous solution 
methods for general partial differential equations are hard to find. But 
it is remarkable how much of the modern world was made possible by 
the study of the calculus of functions of several variables. 


20.4 Exercises 
Questions 


1. 
The telegraphist’s equation can also be seen as a variant of the 


wave equation (technically, in the language of a later chapter, it and 
the wave equation are both hyperbolic partial differential 
equations). What does it mean that a good—indeed, financially 
successful—understanding of it can be obtained by treating it as 
the heat equation? 


Footnotes 


1 See Thompson [256], Kirchhoff [155], Heaviside [140], and Rayleigh’s Theory of Sound, Vol. 1 
p. 466. 


2 See Körner ([164], 334). 


3 See Körner ([164], 336). 


4 For the mathematical details, see Körner ([164], Chap. 62). 


5 Poincaré had been fascinated by telegraphy as a boy and was eager to explain how it worked 
to anyone, especially family members. He continued this interest throughout his 
life, writing on wireless telegraphy too. 


6 Bessel functions arise in the oscillations of a hanging chain, and are standard fare in applied 
mathematics. 


7 See Yavetz [276]. 
21.1 Revision and Assessment 2 


This chapter is given over to revision and discussion of the second 
assignment, see H.3. 

I also recommended that students read some of Sergiu Klainerman’s 
essay from 2000: “PDE as a unified subject”. Of course, it is sometimes 
obscure at this stage. Many of the themes that have driven research into 
partial differential equations in the twentieth century have not been 
broached in this course or, very likely, in any undergraduate course. But 
the first 14 pages, omitting pages 5 and 6, are surprisingly intelligible, 
and in any case they are part of an answer to the traditional request 
from better students to be told something about what research 
mathematicians do. Perhaps more to the point, these 14 pages are a 
modern reflection on the themes that occupy the final part of this book, 
and students will find it worthwhile to think about them when writing 
their final essay. 

The essay is available on the web at 

https://web.math.princeton.edu/~seri/homepage/papers/telaviv.pdf 
22.1 Introduction 


Riemann was very interested in mathematical physics. He published 
four papers on various aspects of it in his lifetime, and four more were 
published after his death. Of the papers he published, the one on the 
formation of shockwaves [236] has at least two major claims to fame. It 
is the first paper to explore the phenomenon, and it made a 
contribution to the theory of hyperbolic partial differential equations 
that is still in use today. 

At the start of this paper, Riemann remarked that just as the study of 
linear partial differential equations had been most fruitful when special 
physical problems were investigated rather than general ones, so too 
the study of non-linear problems was likely to benefit from studying 
physical problems and taking all factors into account. 


22.2 Riemann’s Paper 


Riemann’s [236], as its title indicates, is about plane waves of finite 
amplitude. In the papers by d’Alembert, Euler, and several later 
authors, only waves of infinitesimal amplitude had been 
considered. Poisson had published a long and difficult paper on waves 
of finite amplitude in 1807, and more recently the leading German 
physicist, Hermann von Helmholtz, had published two more papers on 
experimental investigations of the subject. In one of them, he was the 
first person to explain the phenomenon of overtones. Then the subject 
had passed to British applied mathematicians, as Riemann noted. 

What is most interesting about Helmholtz’s paper on overtones is 
his discovery that while the superposition of sound waves in the air is 
linear when the oscillations are infinitesimal, it is not linear for 
waves of finite amplitude. Instead, overtones arise when the squared 
amplitude of the waves exerts a force comparable to that causing the 
oscillations, and so the partial differential equation describing the 
motion is necessarily non-linear. 

But dealing with such waves was the least of Riemann’s novelties. 
The paper is famous for two things: 


1. 
dealing with a non-linear second-order partial differential 
equation with discontinuous solutions, and showing that the zone 
of influence is wedge shaped. The equation is hyperbolic, and its 
characteristics represent the path in space-time of the signals. 


2. 
his methods, which proved of lasting significance in investigating 
hyperbolic partial differential equations. 


Riemann considered a compressible gas in which motion takes 
place along the x-axis. At time $t$ and position $x$, the density is $\rho$, the 
pressure $p$, and the velocity is $u$. The relation between pressure and 
density is given by a function 

\[ p = \varphi(\rho), \]

where all that is known about $\varphi(\rho)$ is that its derivative is always 
positive: $\varphi'(\rho) > 0$. This says, reasonably enough, that pressure 
increases with density. 
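He obtained these differential equations for $\rho$ and $u$ (Sect. 1, p. 
147): 

\[ \rho_t + (\rho u)_x = 0, \qquad \rho\,(u_t + u u_x) = -\varphi'(\rho)\,\rho_x. \]

In terms of $\lambda = \log \rho$, the first of these equations can be written as 

\[ \lambda_t + u\lambda_x = -u_x, \]

and the second as 

\[ u_t + u u_x = -\varphi'(\rho)\,\lambda_x. \]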


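To simplify these equations, he defined 

\[ f(\rho) = \int \sqrt{\varphi'(\rho)}\; d\log\rho \]

and 

\[ r = \tfrac{1}{2}\bigl(f(\rho) + u\bigr), \quad\text{and}\quad s = \tfrac{1}{2}\bigl(f(\rho) - u\bigr). \]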


The new variables r and s will be shown to be the coordinate variables 
that simplify the partial differential equation. They are also a pair of 
characteristics, and it may be helpful to look at the much simpler topic 
of Burgers’ equation (see Appendix B.2) before proceeding. 

For brevity, let us also write 

\[ \alpha = u + \sqrt{\varphi'(\rho)}, \qquad \beta = u - \sqrt{\varphi'(\rho)}. \]

Riemann now deduced in a few lines that 

\[ r_t = -\alpha r_x, \qquad s_t = -\beta s_x. \]

From the formula $dr = r_x\,dx + r_t\,dt$, he deduced that 

\[ dr = r_x(dx - \alpha\,dt) \]

and so $r$ is constant along the curve defined by 

\[ \frac{dx}{dt} = \alpha, \]

and so the point with a constant value of $r$ moves forward with velocity 
$\alpha$ in the direction of increasing $x$. Similarly, points with constant $s$ move 
backwards with velocity $-\beta$, in the direction of decreasing $x$. 


As he put it 


a particular value of $r$, or of $f(\rho) + u$, moves towards larger 
values of $x$ with velocity $\sqrt{\varphi'(\rho)} + u$, while a particular value of 
$s$, or of $f(\rho) - u$, moves towards smaller values of $x$ with velocity 
$\sqrt{\varphi'(\rho)} - u$. 


A definite value of r will gradually meet with each value of s 
lying ahead of r, and the velocity of its progress will depend at a 
given moment on the value of s with which it meets. 
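To see the quantities $r$, $s$ and the two characteristic velocities in a concrete case, the Python sketch below works under Boyle’s law, $p = c^2\rho$ with a constant $c$, so that $\sqrt{\varphi'(\rho)} = c$ and $f(\rho) = c\log\rho$ (taking the constant of integration to be zero). The choice of Boyle’s law, the function names, and the sample state are assumptions made for illustration.

```python
import math

def riemann_invariants(rho, u, c=1.0):
    """Return (r, s) = ((f + u)/2, (f - u)/2) with f(rho) = c * log(rho) (Boyle's law)."""
    f = c * math.log(rho)
    return 0.5 * (f + u), 0.5 * (f - u)

def state_from_invariants(r, s, c=1.0):
    """Invert the definitions: f = r + s gives rho, and u = r - s."""
    return math.exp((r + s) / c), r - s

def characteristic_velocities(u, c=1.0):
    """Velocity of a fixed value of r (forward) and of a fixed value of s."""
    return u + c, u - c

rho, u = 1.5, 0.2                      # illustrative state of the gas
r, s = riemann_invariants(rho, u)
print(r, s)
print(state_from_invariants(r, s))     # recovers (rho, u)
print(characteristic_velocities(u))    # alpha = u + c, beta = u - c
```

Because $\rho$ and $u$ can be recovered from $r$ and $s$, knowing how the values of $r$ and $s$ travel along their characteristics determines the whole motion wherever the solution stays smooth.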


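A further calculation, the details of which I omit, led Riemann to 
observe that the differential 

\[ (x - \alpha t)\,dr - (x - \beta t)\,ds \]

is exact, and if it is set equal to $dw$ then $w$ satisfies the partial 
differential equation 

\[ \frac{\partial^2 w}{\partial r\,\partial s} - m\left(\frac{\partial w}{\partial r} + \frac{\partial w}{\partial s}\right) = 0, \tag{22.1} \]

where $m$ is a function of $r + s$. In fact, on setting $\sigma = r + s = f(\rho)$, 

\[ m = \frac{1}{2}\,\frac{d}{d\sigma}\log\frac{\sqrt{\varphi'(\rho)}}{\rho}. \]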


However, if standard hypotheses about gases are admitted (Poisson’s 
and Boyle’s law) then, Riemann showed, it is possible to reduce to the 
situation where $m = -\dfrac{1}{2a}$, where $a$ is a constant of proportionality in 
Boyle’s law. This formulation depends on $r$ and $s$ not being constants, 
and Riemann also looked quickly at the situation where either $r$ or $s$ is 
constant (if $r$ is constant then $w$ is a function of $s$ alone, and if $s$ is 
constant then $w$ is a function of $r$ alone). 

The change of coordinates from $x$ and $t$ to $r$ and $s$ depends on $r$ and $s$ 
not being constant, and so, as Riemann noted, his method gives no 
information about any region in which $r$ or $s$ is constant. Moreover, the 
coordinate change is only valid where the Jacobian is finite and 
non-zero, and this Jacobian is 

\[ 2\sqrt{\varphi'(\rho)}\; r_x s_x. \]

The cases $r_x = 0$ and $s_x = 0$ have been discussed. What is much more 
interesting is Riemann’s argument about $\rho$ as a function of $x$. From its 
definition it follows that the graph of $\rho$ as a function of $x$ varies in time, 
and the higher values of $\rho$ move faster than the lesser values. So if 
the graph is an increasing function it evolves as time goes by into a less 
steep function. But if the graph is that of a decreasing function, it can 
evolve into the graph of a multi-valued function of $x$, which is absurd. 
For this to happen, Riemann showed, it is enough that x = +e. 


Riemann had now arrived at the linear second-order partial 
differential equation (22.1). He now began to single out a number of 
important features of its solutions, but first, in Sect. 4, he made some 
general observations of lasting significance. 


We treat first of all the case where the initial disturbance of 
equilibrium is restricted to a finite region defined by the 
inequalities $a < x < b$. Thus outside this interval, $u$ and $\rho$, and 
consequently $r$ and $s$, are constant. The values of these quantities 
for $x < a$ are denoted with suffix 1; for $x > b$ suffix 2. The 
region in which $r$ is variable gradually moves forward according 
to Section 1, its lower bound having velocity $\sqrt{\varphi'(\rho_1)} + u_1$, 
while the upper bound of the region, in which $s$ is variable, 
moves backward with velocity $\sqrt{\varphi'(\rho_2)} - u_2$. After a time 
interval 

\[ \frac{b - a}{\sqrt{\varphi'(\rho_1)} + \sqrt{\varphi'(\rho_2)} + u_1 - u_2} \]

the two regions separate, and between them a gap forms in 
which $s = s_2$ and $r = r_1$, and consequently the gas particles are 
again in equilibrium. Thus from the initially disturbed location, 
two waves issue in opposite directions. In the forward wave, 
$s = s_2$; accordingly, to a particular value $\rho$ of the density is 
associated the velocity $u = f(\rho) - 2s_2$, and both values [i.e. of 
density and velocity, JJG] move forward with constant velocity 

\[ \sqrt{\varphi'(\rho)} + u = \sqrt{\varphi'(\rho)} + f(\rho) - 2s_2. \]

In the wave moving backward, on the other hand, the velocity 
$-f(\rho) + 2r_1$ is associated to the density $\rho$, and these two values 
move backward with velocity $\sqrt{\varphi'(\rho)} + f(\rho) - 2r_1$. The rate of 
propagation is greater for greater densities, because both $\sqrt{\varphi'(\rho)}$ 
and $f(\rho)$ increase with $\rho$. 


If we think of $\rho$ as the ordinate of a curve for the abscissa $x$, 
then each point of this curve moves forward parallel to the x-axis 
with constant velocity. Indeed the greater the ordinate, the 
greater the velocity will be. It is easy to see that, according to 
this law, points with greater ordinates would finally overtake 
preceding points, with smaller ordinates, so that to a given value 
of $x$ would correspond more than one value of $\rho$. Since this 
cannot occur in physical reality, a condition must enter that 
renders the law invalid. In fact, the derivation of the differential 
equation is based on the assumption that $u$ and $\rho$ are 
continuous functions of $x$ having finite derivatives. However, this 
assumption ceases to hold as soon as the density curve is 
perpendicular to the x-axis at some point. From this moment on, 
a discontinuity appears in this curve, so that a greater value of $\rho$ 
immediately succeeds a smaller value. This case will be 
discussed in the next section. 

The compression waves, that is, the parts of the wave in 
which the density increases in the direction of propagation, 
become ever narrower with their forward progress and finally 
become compression shocks. However, the width of the 
expansion waves grows in proportion to elapsed time. 

We may easily show, at least under the assumption of 
Poisson’s (or Boyle’s) law, that in the case when the initial 
disturbance of equilibrium is not confined to a finite region, 
compression shocks must also form in the course of the motion, 
excluding quite special cases. The velocity with which a value of $r$ 
moves forward is 

\[ \frac{k+1}{2}\,r + \frac{k-3}{2}\,s \]

under this hypothesis. Thus larger values will, on average, move 
with greater velocity. A larger value $r'$ must eventually overtake 
a preceding smaller value $r''$, unless the value of $s$ 
corresponding to $r''$ is, on average, smaller by 

\[ (r' - r'')\,\frac{1+k}{3-k} \]

than the value of $s$ simultaneously corresponding to $r'$. In this 
case, $s$ becomes negatively infinite for positive infinite $x$, and 
thus for $x = +\infty$, the velocity $u$ is $+\infty$ (or instead, the density, 
according to Boyle’s law, becomes infinitely small). Thus 
excluding special cases, it must always transpire that a value of $r$, 
larger by a finite amount, follows immediately after a smaller 
value. Consequently, since $\partial r/\partial x$ becomes infinite, the differential 
equations lose their validity, and forward-moving compression 
shocks must occur. 
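The overtaking that Riemann describes can be timed in a simple case. The Python sketch below takes a forward-moving wave under Boyle’s law, in which $s$ has the constant value $s_2$ and a given value of $r$ travels with the constant velocity $r - s_2 + c$ (here $c$ stands for $\sqrt{\varphi'(\rho)}$). It follows each sampled value of $r$ along its straight path and finds the first moment at which two neighbouring paths meet. The initial profile, the grid, and the function names are hypothetical choices made for illustration.

```python
import math

def breaking_time(r_values, dx):
    """Earliest time at which two neighbouring paths x_i + (r_i + const) * t meet."""
    times = [-dx / (r_values[i + 1] - r_values[i])
             for i in range(len(r_values) - 1)
             if r_values[i + 1] < r_values[i]]   # only a faster value behind a slower one catches up
    return min(times) if times else math.inf

def positions(x_values, r_values, t, s2=0.0, c=1.0):
    """Position at time t of the point carrying each initial value of r."""
    return [x + (r - s2 + c) * t for x, r in zip(x_values, r_values)]

def is_single_valued(xs):
    """True if the carried points are still in their original order."""
    return all(xs[i + 1] > xs[i] for i in range(len(xs) - 1))

dx = 0.001
x_values = [-4.0 + dx * i for i in range(8001)]
r_values = [0.5 * math.exp(-x * x) for x in x_values]    # hypothetical initial profile of r

t_break = breaking_time(r_values, dx)
print(t_break)
print(is_single_valued(positions(x_values, r_values, 0.9 * t_break)))
print(is_single_valued(positions(x_values, r_values, 1.1 * t_break)))
```

For this profile the breaking time is $-1/\min r'(x) = \sqrt{2}\,e^{1/2}$; before it the profile is still single-valued, and after it the paths have crossed, which is the point at which the differential equations cease to apply and a compression shock must be introduced.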


In the next sections, Riemann showed how the compression shocks 
propagate, and he showed that the values of $u$ and $\rho$ on either side of the 
shock are linked. Riemann, however, failed to ensure conservation of 
energy; the relevant relations were provided in Rankine [232], Rayleigh 
[233], and Hugoniot [147, 148]. But Riemann did notice that the shocks 
must be supersonic with respect to the state in front of them and 
subsonic behind.¹ 

The analysis naturally depends on the initial conditions, and he 
showed that when $u$ and $\rho$ each have two different constant values on 
$x < 0$ and $x > 0$ two waves emerge from the point of discontinuity and 
each could be either a compression or a rarefaction wave. He analysed 
all four cases. 

In the final sections of the paper, Riemann did not show how to 
solve the partial differential equation (22.1) but how to transplant the 
method of Green’s functions from the elliptic to the hyperbolic case. His 
method of defining and using the adjoint of the given partial differential 
equation in order to solve the equation has since become standard. 

His method was extremely ingenious. He wished to solve a partial 
differential equation by a function that, with its first derivatives, takes 
given values on a given (non-characteristic) curve. To do this, he 
introduced a new partial differential equation (technically, this is the 
adjoint of the original partial differential equation) where he was free 
to specify its boundary conditions, provided they meet certain 
constraints. He then showed that a solution to the equation he wished 


to solve can be found if a solution of the new equation can be found. But 
that equation can be solved, although it is very similar to the first one, 
because the boundary conditions can be suitably chosen. Therefore, 
the original equation can be solved. 


22.3 Darboux on Riemann’s Approach to the Shockwave Equation 


To complete the story, we now follow Darboux’s use of Riemann’s 
method to solve Euler’s equation (4.6); I shall consider only the case 
$\beta \ne \beta'$. 
The adjoint equation to that equation is 
fu, 6 du _B du B+R _ 
OxOy x-yOx x-ydy (x-y)/ 


On setting y = (x - y PB yp the last term disappears, so the adjoint 


equation can be treated in the form 
07z Bp’ Oz B Oz 
— i = =0 
OxOy x-ydOx x-ydy 


(22.2) 


Let us denote a solution of this equation by Z(f’, 8), so 


v= (x-yh"? ZG", 8). 
We already know that x*F(-A, p’, 1-A-£B,y/x) isa solution of Eq. (4. 
6), so switching n and £’ gives y = x1 F(A, B, 1-A-f’',y/x)asa 


solution of Eq. (22.2). 
This can be souped up by applying Möbius transformations to x and 
y into a more general solution 


v = (yo — x)"(x - x0) P “F(-A,B, 1 - A- f’,0), 


where __ (x — Xo)(y — yo). Therefore, 
x — yo)(y — Xo) 


u = (y — xP? (yp — x — 00) 8 AF (-A, 8, 1 -A-£',0) 


is a solution to the adjoint equation. 
Still following Darboux, we now seek a solution u of the adjoint 
equation that is 


oh _ (= ) 
Yo — Xo 


when L(x, y), and is (2 a } when z, = 0. When p(x, y) we have 
Yo — Xo 
r =r, the series for F reduces to 1, and the factor (x, — x) ae that 


makes u either zero or infinite unless 2 = —£’. 


Setting A = —£" we find that 


= (yo — x) (y — x) (y - x0) PF G.B', 1,0); 
and indeed if u(x, y) then _ (2 _ 


8’, and if z, = 0 then 
= 


i ( yo-x \ So u is the solution of the adjoint equation that we seek. 
YO-*0 


Finally, to solve Eq. (4.6) in the most general form one substitutes 
this value of u in either Eq. (31.32) or (31.33). For example, from (31. 
33) one deduces 


y 


x] 
Zxoyo = UZ) x,y, + { Ux,y, fi(adx + { 
Xx 


0 YO 


1 
Wi iJ? (y)dy. 


Recall that in this notation, 6; and 9; are two arbitrary functions that 
depend on the boundary conditions for z and odu is what is obtained 


by replacing x and y by a and n in the function g = bp. 


22.4 Telegraphy 
At this point in the second edition of his Leçons, Darboux showed how to 
connect these ideas to the study of telegraphy, which we considered in 
Chap. 20. 

He first showed how to deduce a solution to Euler’s equation (4.6) 
in the case when $\beta = \beta'$ from a solution when $\beta \ne \beta'$. The equation 
becomes 

\[ \frac{\partial^2 z}{\partial x\,\partial y} = \frac{\beta(1-\beta)}{(x-y)^2}\,z, \]

and the solution he obtained in this fashion is 


“= (9 — x) *(y — Dy — x0) BF (6.0. 1 a | 
ce 


He replaced 1 by 6 — | in the equation, and let P(xo, t) when the 


equation becomes 
Pz 
—_—_ — hs 
Oxdy 


which he remarked “is a transform of the so-called telegraphist’s 
equation”. 

This allowed him to solve the telegraphist’s equation by what he 
called indirect methods, but he said it was better to apply Riemann’s 
methods. 

He began with the telegraphist’s equation in the form 


OF ,0°G 
We ae 


where qa and n are positive constants and (—7/2, 7/2) denotes the 


F(t), 


potential at time ¢t and a distance x. He wrote 
U =e Piley 
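and chose units in which the speed of light is unity, and so the equation 
became 

\[ \frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial t^2} + u = 0. \tag{22.3} \]

The characteristics of this equation are 

\[ x + t = \text{const.}, \qquad x - t = \text{const.} \]

Darboux then continued Sect. 361 as follows: 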
Darboux then continued Sect. 361 as follows: 


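This equation is its own adjoint. If $u$ and $v$ are two distinct 
solutions of it, then 

\[ \frac{\partial}{\partial x}\left(u\frac{\partial v}{\partial x} - v\frac{\partial u}{\partial x}\right) - \frac{\partial}{\partial t}\left(u\frac{\partial v}{\partial t} - v\frac{\partial u}{\partial t}\right) = 0. \]

So the integral 

\[ \int \left(u\frac{\partial v}{\partial x} - v\frac{\partial u}{\partial x}\right)dt + \left(u\frac{\partial v}{\partial t} - v\frac{\partial u}{\partial t}\right)dx \]

vanishes on any contour. 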

Let us then, as in Sect. 359, form a contour partly composed 
of characteristics. Let Ox be the x-axis and Ot be the t-axis. The 
characteristics are represented by lines parallel to the 
bisectrices of the axes. If A is an arbitrary point of the plane, we 
draw through this point two characteristics that cut an arbitrary 
curve K in two points $\beta$ and $\gamma$, and we take the above integral 
around the contour $A\beta\gamma$. 
On $A\beta$ one has $dx = dt$, so one can exchange $dx$ and $dt$; as a 
result the corresponding portion of the integral will be 

\[ \int_A^{\beta} (u\,dv - v\,du). \]

Similarly, the portion of the integral taken over $\gamma A$, on which 
$dx = -dt$, will be 

\[ \int_A^{\gamma} (u\,dv - v\,du). \]

If one can find a solution $v$ of the equation that reduces to 1 on 
the segments $A\beta$ and $A\gamma$, then the sum of the two previous 
integrals will be 

\[ 2u_A - u_\beta - u_\gamma, \]

and one will have the equation 

\[ 2u_A = u_\beta + u_\gamma - \int_\beta^\gamma \left\{\left[u\frac{\partial v}{\partial x} - v\frac{\partial u}{\partial x}\right]dt + \left[u\frac{\partial v}{\partial t} - v\frac{\partial u}{\partial t}\right]dx\right\} \]

in which everything is known as soon as one knows the function 
$u$ and one of its derivatives on the curve K. 

It remains to find the solution v that we have supposed to 
exist. The preceding arguments show us the way, and we are led 
to look for a solution of Eq. (22.3) that depends only on the 
variable 


$$\theta = \frac{(t - t_0)^2 - (x - x_0)^2}{4}.$$

One finds, for this solution, precisely the function J that satisfies
the differential equation

$$\theta J'' + J' = J,$$


which is associated to Bessel functions. 

Let us apply our general integral to the particular case that is 
the most important in practice, where our curve K reduces to the 
x-axis, and let us denote the coordinates of A by (x₀, t₀). We then have

$$2u_A = u_\beta + u_\gamma - \int_\beta^\gamma \left[u\frac{\partial v}{\partial t} - v\frac{\partial u}{\partial t}\right]dx.$$

Suppose that we are given at the start the potential and its
derivative. We then know that at the start

$$u = f(x), \qquad \frac{\partial u}{\partial t} = \varphi(x).$$

If we recall that x − t and x + t remain constant on Aβ and Aγ,
respectively, we will have

$$2u(x_0, t_0) = f(x_0 - t_0) + f(x_0 + t_0) + \int_{x_0 - t_0}^{x_0 + t_0} v\,\varphi(x)\,dx + \frac{t_0}{2}\int_{x_0 - t_0}^{x_0 + t_0} f(x)\,\frac{dv}{d\theta}\,dx,$$

where v is equal to J(θ) and θ is $\frac{t_0^2 - (x - x_0)^2}{4}$.


All the details of the propagation can be deduced from this 
formula. 
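
The reduction to an ordinary differential equation in this argument can be checked numerically. The following Python sketch is an illustration and not part of Darboux's text: it assumes that Eq. (22.3) reads u_xx − u_tt + u = 0, in which case a solution of the form v = J(θ) must satisfy θJ″ + J′ = J, and the solution with J(0) = 1 is the power series Σ θ^k/(k!)², which is the modified Bessel function I₀(2√θ). The function names, the sample point, and the step size are hypothetical choices made here.

```python
import math

def J(theta, terms=40):
    """Partial sum of sum_{k>=0} theta^k / (k!)^2, normalised so that J(0) = 1."""
    return sum(theta**k / math.factorial(k)**2 for k in range(terms))

def dJ(theta, terms=40):
    """Term-by-term first derivative of the series."""
    return sum(k * theta**(k - 1) / math.factorial(k)**2 for k in range(1, terms))

def d2J(theta, terms=40):
    """Term-by-term second derivative of the series."""
    return sum(k * (k - 1) * theta**(k - 2) / math.factorial(k)**2
               for k in range(2, terms))

def ode_residual(theta):
    """theta*J'' + J' - J, which should vanish up to rounding."""
    return theta * d2J(theta) + dJ(theta) - J(theta)

def pde_residual(x, t, x0=0.3, t0=1.2, h=1e-3):
    """Central-difference estimate of v_xx - v_tt + v for v = J(theta(x, t))."""
    def v(xx, tt):
        return J(((tt - t0)**2 - (xx - x0)**2) / 4.0)
    v_xx = (v(x + h, t) - 2 * v(x, t) + v(x - h, t)) / h**2
    v_tt = (v(x, t + h) - 2 * v(x, t) + v(x, t - h)) / h**2
    return v_xx - v_tt + v(x, t)

if __name__ == "__main__":
    print(ode_residual(0.7), pde_residual(0.1, 0.4))
```

On the two characteristics through (x₀, t₀) one has θ = 0 and hence v = 1, which is the condition imposed on v in the argument above.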


22.5 Exercises 


As with some of my other history courses, there comes a point where 
the historical significance of some of the conceptual developments 
under discussion is arguably obscured by strictly mathematical 
exercises, which would either be too elementary or too hard. This is the 
case from now on, so only questions will now appear at the end of each 
chapter. 


Questions 


1. Look back over the solution to the wave equation in the form

$$\frac{\partial^2 u}{\partial x\,\partial t} = 0.$$

The characteristic curves are parallel to the x- and t-axes,
and the general solution is $u = f(x) + g(t)$. How much
information must be given on the x- and t-axes for the solution to
be determined in the rectangle defined by the origin and the points
(a, 0), (0, b), and (a, b)?

2. Look over the account of Burgers’ equation in Sect. B.2 and then
again at Riemann’s account of the formation of a shockwave.


Footnotes 


1 Rankine cited four previous authors: Poisson, Stokes, Airy, and Earnshaw, but not Riemann; 
Riemann cited only Helmholtz; Rayleigh and Hugoniot did not mention Riemann. The first 
person to follow Riemann was E.B. Christoffel in his [41]. 
23.1 Introduction


Few branches of mathematics have the visual charm of the theory of 
minimal surfaces, which is one of the areas where analysis and 
differential geometry most profitably intersect. The topic was initiated 
by Euler and Lagrange but advanced only slowly until the work of 
Meusnier in the 1780s. The appropriate partial differential equation is 
difficult because it is non-linear, and it was taken up by Legendre and 
Monge, but left many secrets that were only to be unlocked in the later 
nineteenth century, chiefly by Riemann and Weierstrass.¹


23.2 Euler and Lagrange 


Many interesting problems in geometry and analysis arise when a 
function of some kind is to be minimised. A geodesic on a surface is a 
curve of shortest length joining two points, and we have already seen in 
Chap. 7 that problems in the calculus of variations can have attractive 
solutions. Informally, a minimal surface is a surface of least area 
spanning a given curve in space, and many elegant surfaces can be 
obtained by dipping a wire frame into a strong soap solution.² More
precisely, a minimal surface is a surface with the property that any
closed curve drawn on the surface bounds a region of the surface of smaller
area than any other surface with that curve as boundary.³ So minimal
surfaces are the two-dimensional analogues of geodesics on a surface.
Unfortunately, for reasons that have to do with the difficulty of the
mathematics and the poor grasp people had of it in the early days, the
name is a bit of a misnomer. It emerges from the mathematical
formulations of the theory that they should be called extremal surfaces:
they may either be of least area, greatest area, or ambiguous in this
respect. A precise analogy is with finding a maximum or a minimum of
a curve with an equation y = f(x). If you only have the ability to
consider solutions of y′ = 0 you will find the maxima, minima, and
horizontal inflection points, that is, the extremal points.

The study of minimal surfaces began in the eighteenth century with 
Euler, who had the idea that they might be interesting, Lagrange, who 
gave an analytic version of the theory in the form of a partial 
differential equation satisfied by a minimal surface given by an 
equation of the form z = f(x, y), and Meusnier, who gave a geometric 


condition that a minimal surface must satisfy. 

In Sect. V, Sects. 45-47 of his Methodus inveniendi (1744) Euler 
discussed problems involving surfaces of revolution. In Sect. 45, he 
showed how to find, among all curves passing through two points that 
enclose a given area with an axis, the curve that on rotation about that 
axis generates the solid whose surface has the least area. In Sect. 46, he 
showed how to find, among all curves of a given length and passing 
through two points, the one that generated the greatest solid on being 
rotated about an axis.⁴ In Sect. 47, he found among the same class of
curves the one that generated the solid of either greatest or least
surface area on being rotated about the axis.

In each of these cases, and throughout the book, Euler began with 
the integral expressing the quantity to be maximised or minimised, 
looked at the variation of the integral, and deduced a differential 
equation that characterised the solution. In Sect. 45, the solution curve 
satisfies the equation 


$$dx = \frac{(ny + b)\,dy}{\sqrt{(1 - n^2)y^2 - 2bny - b^2}}.$$


Here b is a constant of integration and n is a parameter that expresses 
the effect of the constraints; it is what has come to be called a Lagrange 
multiplier. 

Although Euler could certainly have integrated the above 
expression, he might well have found the general solution 
unilluminating, and instead he considered only the special cases where 
x = O (bis a constant of integration), x = 0, and (—R, 0) (a case we 


shall ignore).
When n = 0, the equation reduces to

$$dx = \frac{b\,dy}{\sqrt{y^2 - b^2}},$$


and Euler wrote that “the curve will be a catenary concave to the axis”.⁵
In this case, the constraint does not enter the problem, and the
Euler–Lagrange equation, as we would say, for the problem is for the case of
curves of any length. But Euler wrote not a word about that, and dealt
only with the other special case, (—R, 0). 


In Sect. 47, the solution curve satisfies the equation

$$dx = \frac{c\,dy}{\sqrt{(b + y)^2 - c^2}},$$

where b is an arbitrary constant determined by the length of the curve
and c is a constant of integration, so


$$y = -b + c\cosh(x/c).$$

The corresponding surface is a minimal surface only when b = 0, a
condition that Euler did not mention (nor did he remark that when
b = 0 this problem coincides with Problem 45 in the case when n = 0).


He did, however, conclude that the answer is a catenary, the minimum 


area deriving from the case when the catenary is concave with respect 
to the axis, the maximum when it is convex. This surface, in the 
particular case when it is also a minimal surface, was later called the 
catenoid by Plateau. 
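
The step from the differential equation of Sect. 47 to the catenary can be checked numerically. The sketch below is an illustration under a stated assumption: it takes the equation in the form dx = c dy/√((b + y)² − c²) and compares the slope dy/dx implied by it with a finite-difference slope of y = −b + c cosh(x/c) on the branch x > 0. The function names and the sample values of b and c are chosen here and are not Euler's.

```python
import math

def catenary(x, b, c):
    """The curve y = -b + c*cosh(x/c)."""
    return -b + c * math.cosh(x / c)

def slope_from_equation(y, b, c):
    """dy/dx implied by dx = c dy / sqrt((b + y)^2 - c^2), on the branch x > 0."""
    return math.sqrt((b + y)**2 - c**2) / c

def slope_numeric(x, b, c, h=1e-6):
    """Central-difference slope of the catenary at x."""
    return (catenary(x + h, b, c) - catenary(x - h, b, c)) / (2 * h)

if __name__ == "__main__":
    b, c = 0.3, 1.5  # illustrative values only
    for x in (0.5, 1.0, 2.0):
        y = catenary(x, b, c)
        print(x, slope_from_equation(y, b, c), slope_numeric(x, b, c))
```

Setting b = 0 in the same code gives the generating curve of the catenoid.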

Most books about minimal surfaces credit Euler with being the first 
to find a surface of revolution that is a minimal surface, and refer to 
these examples. But it would seem that he did not, in fact, explicitly 
address the problem of finding the curve that generates the surface of 
least area on being rotated about an axis, and although its solution 
appears, it does so only as a special case about which he said nothing. 

Lagrange improved on Euler’s treatment of the calculus of 
variations in his essay (1761) and, in particular, in Appendix I of the 
work he tackled the question of minimal surfaces in this spirit. 

He wrote down that the area of a piece of surface given by an 
equation of the form z = f(x,y) and spanning a fixed boundary was 


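given by $\iint W\,dx\,dy$, where $W = \sqrt{1 + p^2 + q^2}$ and, as had become
customary, $p = \partial z/\partial x$ and $q = \partial z/\partial y$. He then argued that

$$\iint \frac{p\,\delta p + q\,\delta q}{W}\,dx\,dy = 0.$$

This double integral equals

$$\iint \left(\frac{p}{W}\,\frac{\partial\,\delta z}{\partial x} + \frac{q}{W}\,\frac{\partial\,\delta z}{\partial y}\right)dx\,dy$$

because, according to the general theory he had developed earlier,

$$\delta p = \delta\left(\frac{\partial z}{\partial x}\right) = \frac{\partial\,\delta z}{\partial x},$$

with a similar expression involving q. Integrating by parts shows that
the first variation of area vanishes when

$$\iint \left[\frac{\partial}{\partial x}\left(\frac{p}{W}\right) + \frac{\partial}{\partial y}\left(\frac{q}{W}\right)\right]\delta z\,dx\,dy = 0.$$

Therefore

$$\frac{\partial}{\partial x}\left(\frac{p}{W}\right) + \frac{\partial}{\partial y}\left(\frac{q}{W}\right) = 0, \qquad (23.2)$$

and Lagrange concluded by remarking that, because of (23.2),

$$\frac{p\,dy - q\,dx}{W} \qquad (23.3)$$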


must be an exact differential. 
As he remarked, when z = ax + by + c the exact differential condition is


satisfied and the surface is a plane, but, as he said, this is a very special 
case (p. 356) “because the general solution must be such that the 
boundary of the surface can be determined at will”. He was, however, 
unable to solve any other cases of Eq. (23.2) and therefore he found no 
other minimal surface, and concluded his account by showing that the 
sphere solves the problem of finding a surface of least area enclosing a 
given volume. 

So Lagrange had shown that a surface that is the graph of a function 
z = f(x,y) (the surface is then said to be given in nonparametric or 


explicit form) and which has the least area among all surfaces with a 
given perimeter is to be found as a solution of the Euler-Lagrange 
equation for the area functional: 


$$\frac{\partial}{\partial x}\left(\frac{p}{W}\right) + \frac{\partial}{\partial y}\left(\frac{q}{W}\right) = 0. \qquad (23.4)$$

This equation is today called the divergence form of the minimal
surface equation (MSE). Lagrange did not write down the
corresponding second-order partial differential equation explicitly,
most likely because, as we have seen, there was no theory of partial
differential equations at the time.⁶ It is

$$(1 + q^2)z_{xx} - 2pq\,z_{xy} + (1 + p^2)z_{yy} = 0. \qquad (23.5)$$
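
The second-order equation is easy to test numerically. The sketch below is an illustration under stated assumptions: it takes the equation in the form (1 + q²)z_xx − 2pq z_xy + (1 + p²)z_yy = 0, estimates the derivatives by central differences, and evaluates the left-hand side for a plane, for the helicoid written locally as z = arctan(y/x), and, as a contrast, for a paraboloid, which is not a minimal surface. The function names and the step size h are choices made here.

```python
import math

def mse_residual(f, x, y, h=1e-3):
    """Central-difference estimate of (1+q^2) r - 2 p q s + (1+p^2) t for z = f(x, y)."""
    p = (f(x + h, y) - f(x - h, y)) / (2 * h)
    q = (f(x, y + h) - f(x, y - h)) / (2 * h)
    r = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    t = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    s = (f(x + h, y + h) - f(x + h, y - h)
         - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return (1 + q**2) * r - 2 * p * q * s + (1 + p**2) * t

def plane(x, y):
    return 2 * x - 3 * y + 1

def helicoid(x, y):
    # Local graph of the helicoid, valid away from the negative x-axis.
    return math.atan2(y, x)

def paraboloid(x, y):
    return x * x + y * y

if __name__ == "__main__":
    for name, f, pt in (("plane", plane, (0.3, 0.2)),
                        ("helicoid", helicoid, (1.0, 0.5)),
                        ("paraboloid", paraboloid, (0.3, 0.2))):
        print(name, mse_residual(f, *pt))
```

For the plane and the helicoid the residual is zero to within discretisation error; for the paraboloid it is 4 + 8(x² + y²), which never vanishes.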


23.3 Meusnier, Monge, and Legendre 


The first mathematician to provide insight into the geometry of 
minimal surfaces was the 21-year-old Jean Baptiste Marie Charles 
Meusnier, who was briefly a student of Monge. To understand his 
contribution, consider, as Euler had done before him, the plane sections 
of a surface that pass through the normal at a given point, P, say. 
Suppose that we consistently choose a direction for these normals as P
varies (let us call them $n_P$), as we may at least for small regions of the
surface. At each point P, the plane sections containing $n_P$ cut the


surface in curves that pass through P, and each such curve (called a 
curve of section) has a circle in the plane of section that best 
approximates it. We say that this circle has a positive radius of 
curvature if its centre, C, lies on the normal and in the positive direction 
heading away from P, otherwise negative. The signed magnitude PC, the 
radius of the circle, is called the radius of curvature of the 
corresponding curve of section. A saddle-shaped surface will have some 
curves of section with positive radii of curvature and others with 
negative radii of curvature. Euler showed that for
almost all surfaces (except the sphere) and for almost all points on 
these surfaces, there are precisely two curves of section that have 
extremal values for the radii of curvature among all curves of section 
with a common normal. These radii are called the radii of curvature of 
the surface at the point P (Fig. 23.1). 


Fig. 23.1 The helicoid (left) and the catenoid (right) 


Meusnier’s contribution was to realise that Lagrange’s partial 
differential equation for a minimal surface was the condition for a 
surface to have the average of its radii of curvature vanish at each point. 
This average is called the mean curvature.⁷ The first two minimal
surfaces were in fact discovered by Meusnier, and they are the helicoid 
and the catenoid. 

In his major paper (1784), Monge introduced the principal curves 
through an arbitrary point on a surface given by an equation of the 
form z = f(x,y), which he defined (in Sect. 22) as the curves along 


which consecutive normals intersect, and he observed that, in general, 
there are two such curves at each point and they are the curves of 
greatest and least curvature at that point. He went on to give an 
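equation for the radii of curvature R: they satisfy the equation
$gR^2 - h\sqrt{k}\,R + k^2 = 0$, where (as he explained more clearly in his (1787,
Sect. 3))

$$g = rt - s^2, \quad h = (1 + q^2)r - 2pqs + (1 + p^2)t, \quad\text{and}\quad k = 1 + p^2 + q^2.$$

Here

$$p = \frac{\partial z}{\partial x}, \quad q = \frac{\partial z}{\partial y}, \quad r = \frac{\partial^2 z}{\partial x^2}, \quad s = \frac{\partial^2 z}{\partial x\,\partial y}, \quad t = \frac{\partial^2 z}{\partial y^2}.$$

For a surface of zero mean curvature, the radii of curvature $R_1$ and $R_2$
at a point satisfy $R_1 + R_2 = 0$, and therefore $h\sqrt{k} = 0$ and so h = 0.
This is, of course, the equation for a minimal surface in the form

$$(1 + q^2)r - 2pqs + (1 + p^2)t = 0,$$

or, more explicitly,

$$(1 + q^2)\frac{\partial^2 z}{\partial x^2} - 2pq\,\frac{\partial^2 z}{\partial x\,\partial y} + (1 + p^2)\frac{\partial^2 z}{\partial y^2} = 0.$$
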


Monge sought to solve this equation in his long paper (1787).⁸ In

Sect. 23 of his paper (1787), Monge, after noting that Meusnier had 
shown that a minimal surface was also a surface of zero mean 
curvature, proceeded to try to solve the partial differential equation. He 
wrote down the equation for the characteristic curves: 


$$(1 + q^2)\,dy^2 + 2pq\,dx\,dy + (1 + p^2)\,dx^2 = 0. \qquad (23.6)$$

Factorising equation (23.6) then led to the equation for a curve given by 
$$dy = m_1\,dx,$$

along which α is constant, and another curve, given by

$$dy = m_2\,dx,$$

along which β is constant, where

$$m_1 = \frac{-pq + i\sqrt{1 + p^2 + q^2}}{1 + q^2}, \qquad m_2 = \frac{-pq - i\sqrt{1 + p^2 + q^2}}{1 + q^2}. \qquad (23.7)$$

Each of these led to expressions for x and y in terms of p and q from
which Monge obtained expressions for x, y, and z as integrals of
functions involving α and β. As can be seen, the characteristic curves


are complex, not real, and this raises a number of problems. Monge’s 
hope must have been that in the end a real surface could be obtained. 

In Sect. XV of the Applications Monge repeated the analysis of the 
principal curves on a surface that he had given in his (1784), and in 
Sect. XX he turned to the study of surfaces of zero mean curvature. He 
deduced the equation for a surface of zero mean curvature from his 
equation for the radii of curvature and noted that this was the equation 
that Lagrange had shown defined surfaces of least area (but this time 
made no mention of Meusnier’s name—Meusnier had died in the battle 
of Mainz in 1793). 

Once again Monge proceeded to deduce properties of the minimal 
surface along the characteristic curves. He also presented, as he had 
before, a second characteristic equation: 


$$(1 + q^2)\,dp^2 - 2pq\,dp\,dq + (1 + p^2)\,dq^2 = 0, \qquad (23.8)$$


which had been mysterious in his paper but had by now been more 
properly rederived by Legendre. After some algebra, here suppressed, 
he was able to come up with the result that a solution to the minimal 
surface equation was of the form 


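$$x = -\Phi'(\alpha) + \Psi'(\beta) \quad\text{and}\quad y = -\Phi(\alpha) + \alpha\Phi'(\alpha) + \Psi(\beta) - \beta\Psi'(\beta),$$

where z is found by differentiating these equations to obtain dx and dy
and using dz = p dx + q dy. The resulting equation can be integrated
and the result is that

$$z = \int \Phi''(\alpha)\sqrt{-1 - \alpha^2}\,d\alpha + \int \Psi''(\beta)\sqrt{-1 - \beta^2}\,d\beta.$$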


For Monge and his contemporaries, the problem with this formula was 
the apparently ineradicable appearance of imaginary quantities. As we 
have seen, mathematicians in the eighteenth century had no problem 
using formal complex methods in real geometrical problems, but the 
imaginary quantities were required to cancel at the last stage so that 
the solution could be purely real. Faced with a result where apparently
this could not be done, Monge did not proceed any further with the analysis,
and his solution did not enable him to find any more examples of
minimal surfaces.

The apparently intractable nature of the solution remained, as
Poisson was to note in his (1832):


Monge integrated [the minimal surface equation] in a finite
form, but by considerations that did not seem to be admissible
and involved him in long discussions with Laplace.

Legendre then obtained the same integral, by means of a 
transformation applicable to a class of second-order equations, 
which could not leave any doubt as to the exactness of the result. 
Unfortunately one cannot deduce anything from this integral, 
which is complicated by imaginary quantities .... 


23.4 Riemann and Weierstrass 


The topic of minimal surfaces in the nineteenth century is one of the 
success stories for complex function theory, which itself was a major new
development in the period. It also illustrates the power of the theory of
linear ordinary differential equations, which makes it worthwhile 
considering it here, and it was a rich field for differential geometers in 
the decades after Gauss. 

Gauss had shown the value of studying a surface embedded in ℝ³


by looking at the unit normals at each point (and coherently specifying 
an outward or positive direction). These define a map from the surface 
to the unit sphere by imagining each normal is moved parallel to itself 
until it is based at the centre of the unit sphere, and then associating to 
each point P on the surface the tip of the unit normal, which is a point on
the sphere. After Gauss’s death, this map came to be called the Gauss map.
The first to study the Gauss map of a minimal surface was Ferdinand 
Minding in 1850. He showed that it enabled one to transplant the 
curves of longitude and latitude on the sphere to curves which 
Gauss had earlier shown were highly convenient in the study of 
surfaces. Gauss had suggested that a coordinate grid could be imposed 
on a surface by choosing a curve (usually a geodesic), drawing all the 
geodesics that meet this geodesic at right angles, and then drawing all 


the curves orthogonal to those geodesics. If this is done on the sphere 
starting with the equator, one first obtains the meridians or longitudes, 
and then the parallels or latitudes. Accordingly, on any surface the 
curves in the first family, the geodesics, are called meridians, and the 
curves in the second the parallels. Minding showed that for a minimal 
surface the inverse of the Gauss map maps meridians to meridians and 
parallels to parallels. This meant, in particular, that orthogonal curves 
were mapped to orthogonal curves. 

Minding was soon followed by Ossian Bonnet, who considered a 
different grid on a surface, defined by taking at each point the two 
curves through that point whose radii of curvature are the extremal 
values. These curves are called the lines of curvature on the surface. 
Bonnet showed that the lines of curvature on a minimal surface are 
mapped by the Gauss map onto curves on the sphere that map by 
stereographic projection to two families of orthogonal straight lines. It 
followed, by Gauss’s paper of 1825 on conformal mappings, that the 
Gauss map of a minimal surface was conformal. Bonnet was also able to 
show in his [20] that the coordinates (or at least the z-coordinate) of a 
map defining a minimal surface are harmonic. 

From conformal maps and harmonic maps to complex analytic maps 
is in hindsight but a small step, but the examples of the authors just 
discussed show that it may not have seemed that way in 1860. Indeed, 
it was taken for the first time only by the two leading complex analysts 
of their day, Riemann and Weierstrass, independently. Riemann’s 
account was entrusted by him to Hattendorff for editing in April 1866, 
but apparently dates from 1860 to 1861. The original manuscript 
consists purely of formulae, and Hattendorff supplied a text; the result 
was published in 1867. Weierstrass’s account was first given in a
lecture at the Berlin Academy in 1866. Weierstrass was also the first to 
give a general account of algebraic minimal surfaces. 

Riemann’s approach was to define a piece of surface by a map from
a patch of ℝ² in (p, q)-coordinates into ℝ³ in the usual (x, y, z)-coordinates,
then to map the surface onto the unit sphere by the Gauss
map. The area of an infinitesimal piece of the surface in ℝ³ is related to


the corresponding area on the sphere by the Jacobian of the Gauss map. 


In this way, the area of the entire surface is known as a double integral, 
and the condition that the surface be a minimal surface is that the first 
variation of that integral vanishes. 

This condition reduced to the statement that a certain differential 
form was exact, and from that exact differential form Riemann deduced 
that it was possible to impose isothermal coordinates on the surface in 
such a way that its Gauss map became complex analytic. It followed that 
the minimal surface was represented conformally on the sphere and 
the plane, and Riemann went on to show that mean curvature at each 
point was zero. 

From this insight, Riemann was able to obtain formulae that 
parametrised the minimal surface: 


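$$x = \mathrm{Re}\left[-i\int \left(\frac{du}{d\log\eta}\right)^2\left(\eta - \frac{1}{\eta}\right)d\log\eta\right], \qquad (23.9)$$

$$y = \mathrm{Re}\left[-\int \left(\frac{du}{d\log\eta}\right)^2\left(\eta + \frac{1}{\eta}\right)d\log\eta\right], \qquad (23.10)$$

and

$$z = \mathrm{Re}\left[-2i\int \left(\frac{du}{d\log\eta}\right)^2 d\log\eta\right]. \qquad (23.11)$$

These equations are equivalent to those known today as the
Weierstrass–Enneper equations for the coordinates on a minimal
surface, in the form Weierstrass was to give that involves only one
function, u = u(η) (see Eqs. (23.12), (23.13), and (23.14)). However,
Riemann did not pause even to notice that he could now write down
infinitely many examples of minimal surfaces in terms of a function
u = u(η).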


Thus, a minimal surface is obtained every time one has an analytic 
function. The deeper question that then arose was to ask for the 
minimal surface that spans a given curve in space; this is the so-called 
Plateau problem.⁹


Riemann tackled this problem by means of a detailed study of the
behaviour of the Gauss map, but even he found this task daunting,
and gave explicit solutions only for simple boundaries: two
skew lines in space (the helicoid); two intersecting lines and a third
lying in a plane parallel to the first two; three skew lines (which led to a
generalisation of the Riemann P-function); the regular space
quadrilateral (later studied by Schwarz); and two circles in two parallel
planes.

Weierstrass’s approach was different.¹⁰ He started with the
expression for the mean curvature, and assumed that the given piece of
surface can be defined by a conformal map from a patch U of ℝ² with
coordinates p and q into ℝ³ with coordinates x, y, z. Typically, he gave


no references, saying only that it was well known that one could do this; 
Gauss had indicated in his (1828) that this can always be done. He then 
showed that x, y, and z must be harmonic functions of p and q, and so
the real parts of three complex functions f, g, and h of u = p + iq,
meromorphic on the whole of U. The conformality condition on the
map implies, in terms of f, g, and h, that

$$f'(u)^2 + g'(u)^2 + h'(u)^2 = 0.$$

Weierstrass shrewdly saw that this implies that there are functions
G(u) and H(u) such that

$$f'(u) = G^2 - H^2, \quad g'(u) = i(G^2 + H^2), \quad h'(u) = 2GH.$$
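
That functions G and H of this kind do the job can be confirmed in one line. Assuming that the conformality condition takes the usual form f′² + g′² + h′² = 0 (which is what it amounts to when x, y, and z are the real parts of f, g, and h), substitution of the three expressions gives:

```latex
\begin{aligned}
f'(u)^2 + g'(u)^2 + h'(u)^2
  &= (G^2 - H^2)^2 - (G^2 + H^2)^2 + 4G^2H^2 \\
  &= -4G^2H^2 + 4G^2H^2 = 0 .
\end{aligned}
```

Conversely, given f′ and g′ one may set G² = (f′ − ig′)/2 and H² = −(f′ + ig′)/2, and the condition then forces h′² = 4G²H².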
He then introduced the complex variable s defined by $s = H(u)/G(u)$, and a
technical argument, which I omit, finally led Weierstrass to the complex
function $G^2\,\frac{du}{ds}$, which he called $\mathfrak{F}(s)$, upon which he based his
analysis. He gave explicit power series expressions for the coordinate
functions of a minimal surface in terms of s, and also for its Gaussian
curvature. He pointed out that the principle of analytic continuation 


then allowed the coordinate functions to be defined for the entire 
minimal surface. So he could finally proclaim that to every single- 
valued analytic function there corresponds a surface with mean 
curvature everywhere zero. 

The parameterisation that does this is known today as the 
Weierstrass—Enneper equations and it can be given in various forms. 

After a little work, the Weierstrass—Enneper representation can be 
given in the form 


$$x = c_1 + \mathrm{Re}\int (1 - \omega^2)\,R(\omega)\,d\omega, \qquad (23.12)$$

$$y = c_2 + \mathrm{Re}\int i(1 + \omega^2)\,R(\omega)\,d\omega, \qquad (23.13)$$

$$z = c_3 + \mathrm{Re}\int 2\omega\,R(\omega)\,d\omega. \qquad (23.14)$$

Here, x, y, and z are the real parts of f(u), g(u), and h(u).

Weierstrass gave no examples; however, for suitable choices of R
various well-known minimal surfaces are obtained. For example,
setting R = 1 leads to Enneper’s minimal surface, and $R(\omega) = -1/(2\omega^2)$
to the catenoid.
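
To see the representation at work, one can integrate it numerically. The sketch below is an illustration and not Weierstrass's procedure: it assumes the representation in the form x = c₁ + Re∫(1 − ω²)R(ω)dω, y = c₂ + Re∫i(1 + ω²)R(ω)dω, z = c₃ + Re∫2ωR(ω)dω with c₁ = c₂ = c₃ = 0, integrates along the straight segment from 0 to ω by Simpson's rule, and compares the result for R = 1 with the closed form obtained by integrating the polynomials directly. The function names are hypothetical.

```python
def weierstrass_enneper(R, w, steps=200):
    """Real parts of the three integrals from 0 to w, taken along a straight
    segment with composite Simpson's rule (steps must be even)."""
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    integrands = (
        lambda s: (1 - s * s) * R(s),
        lambda s: 1j * (1 + s * s) * R(s),
        lambda s: 2 * s * R(s),
    )
    coords = []
    for g in integrands:
        total = g(0j) + g(w)
        for k in range(1, steps):
            total += (4 if k % 2 else 2) * g(w * k / steps)
        # The substitution omega = tau * w contributes the factor w.
        coords.append((total * w / (3 * steps)).real)
    return tuple(coords)

def enneper_closed_form(u, v):
    """Enneper's surface from R = 1, with omega = u + iv."""
    return (u - u**3 / 3 + u * v**2,
            -v - u**2 * v + v**3 / 3,
            u**2 - v**2)

if __name__ == "__main__":
    w = 0.7 + 0.4j
    print(weierstrass_enneper(lambda s: 1.0, w))
    print(enneper_closed_form(w.real, w.imag))
```

The choice R(ω) = −1/(2ω²) cannot be integrated from ω = 0, where R has a pole; a path starting at another base point would be needed for the catenoid.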
The alternative expressions in terms of the functions G(u) and H(u) 
are 


$$x = x_0 + \mathrm{Re}\int_0^u \left(G(u)^2 - H(u)^2\right)du, \qquad (23.15)$$

$$y = y_0 + \mathrm{Re}\int_0^u i\left(G(u)^2 + H(u)^2\right)du, \qquad (23.16)$$

$$z = z_0 + \mathrm{Re}\int_0^u 2G(u)H(u)\,du. \qquad (23.17)$$


The connection between these formulae and Riemann’s is given by 
= 2, 
~ G(u) 


Because Weierstrass’s accounts were published and taken up by his 
students before Riemann’s posthumous account appeared, most 
subsequent authors credited Weierstrass with discovering the intimate 
connection between minimal surfaces and complex function theory. 

As we saw in Chap. 19, Schwarz was the student of Weierstrass who 
remained most closely associated with the master. This is particularly 
true of his earliest work, as one would naturally suppose. In 1865, 
Weierstrass gave a seminar at the University of Berlin on minimal 
surfaces. Perhaps as part of the seminar, Schwarz took up and solved 
the problem of finding the minimal surface bounded by a space 
quadrilateral, unaware that this problem had already been solved by 
Riemann. Schwarz’s mathematical solution was presented to the Berlin
Academy by Kummer, accompanied by a gypsum model that Schwarz, it
seems, had made himself, based on experiments with glycerine. On the
strictly mathematical side, Schwarz’s analysis made considerable use of
the theories of elliptic and hyperelliptic functions, because he chose a
particular hyperelliptic function for substitution into the
Weierstrass–Enneper equations. This gave his surface a natural periodicity:
pieces of it could be fitted together to form an annular region with two
boundaries, and these pieces could in turn be joined up to form an
infinitely extended surface with infinite topological genus. Altogether a
remarkable discovery with which to embark on a career in
mathematics (Fig. 23.2).


Fig. 23.2 A piece of Schwarz’s periodic minimal surface, S. Schwarz, Gesammelte 
mathematische Abhandlungen, vol. I, facing p. 132 


A deep insight into analytic functions underpinned this work. This 
was the recognition that if a straight line lies in a minimal surface then 


the surface has that line as a line of symmetry. In other words, and to 
give the highly plausible physical motivation, if two pieces of minimal 
surface meet along a common line and have the same normals there 
then each is the analytic continuation of the other. This is the origin of 
the Schwarz reflection principle, which Schwarz went on to prove in 
1869. Interestingly, Schwarz wrote that he had learned this insight from 
a conversation with Weierstrass. 


23.5 Simple Solutions of the Plateau Problem 


As noted above, the Plateau problem asks for a minimal surface that 
spans a given curve in space. In general, given a piece of a minimal 
surface bounded by a curve in space, one has no knowledge of the 
behaviour of the Gauss map on the boundary curve. But if the boundary 
curve contains a straight line then along that line the image of the Gauss 
map can only be an arc of a circle on the image sphere (although any 
part of it may be covered more than once). For this reason, attempts on 
the Plateau problem were confined to polygonal curves in space, 
because the image of the boundary under the Gauss map is made up of 
arcs of great circles on the sphere. 

The polygon is specified by giving the coordinates of its n vertices. 
So the problem is a three-dimensional version of the Schwarz- 
Christoffel problem, which was under investigation at the same time. 

Accordingly, the solution to the Plateau problem might then be
expected to come in two parts. The minimal surface is realised by a
map from the upper half-plane that maps the real axis to the polygonal
boundary curve. In the first part, we take arbitrary points z₁, z₂, ..., zₙ
on the real axis and look for a holomorphic map with branch points at
these points of the correct order, so that the map will send the upper
half-plane onto a polygon with the required angles at each vertex, but
not necessarily a polygon of the right size and shape. In the second part,
we fix the positions of the points z₁, z₂, ..., zₙ on the real axis so that
the map is onto the given polygon. The side lengths are given by certain
integrals, and the task is to use that information to determine the right
values of those points on the real axis.


In a note (the “Fortsetzung”) added to his first paper on minimal
surfaces, Weierstrass approached the problem through the theory of
linear differential equations, and tried to define the equation that the
functions G(u) and H(u) must satisfy.¹² It is trivial that two functions
satisfy a second-order linear differential equation; the important
question is what can be said about the coefficients. Weierstrass claimed
that the equation satisfied by the functions G and H has rational 
coefficients that depend on the lengths and directions of the boundary 
segments. However, “In general”, he said, 


the determination of the constants in this differential equation, 
such as the constants of integration, depend on the solution of 
transcendental equations; but it is not difficult to find special 
cases where one can find complete expressions for G(u) and 
H(u) in terms of known functions. 


Weierstrass concluded by promising a full report at a later date, but this 
was never given. All we have is the “Bestimmung”, which is Schwarz’s 
account written in the 1890s of what Weierstrass had in mind when 
giving that two-page report to the Academy of Sciences in December 
1866. 

As Schwarz reconstructed it, Weierstrass’s approach had been to 
consider what had to happen at each vertex of the polygon. The polygon 
has n angles α₁, α₂, ..., αₙ at the corresponding vertices. Moreover,
each angle defines a plane whose orientation is captured by the normal
to the angle at the vertex. Schwarz claimed that by working in this way
Weierstrass had found some important facts about this differential
equation. In particular, he had deduced that at each branch point the
equation, regarded as an equation for a function y(t), has a simple
pole for the coefficient of dy/dt and a double pole for the y-coefficient.


These are the only singular points, and the point at infinity must be 
what is called an inessential singular point for the differential 
equation.¹³

This argument fails to determine the required differential equation 
completely, and as a result it seems to have been abandoned. First, as 


Weierstrass had indicated, it is necessary to show how the pre-images 
of the vertices on the boundary are determined, but that problem was 
not discussed. Second, it says nothing about the important role played 
by the branch point of the Gauss map in the interior of the minimal 
surface. But these are very difficult problems. Indeed, as late as 1914 
Darboux remarked that “Thus far, mathematical analysis has not been 
able to envisage any general method which would permit us to begin 
the study of this beautiful question”. 


23.6 Exercises 
Questions 


1. Find examples on the web of some of the best-known minimal
surfaces: the helicoid, the catenoid, and Enneper’s surface.

2. On the (correct) assumption that any closed curve spans a minimal
surface, find examples of minimal surfaces that are topologically
Möbius bands.


Footnotes 


1 For more detail on all of this material, see the forthcoming book by Gray and Micallef on Jesse 
Douglas, minimal surfaces, and the first Fields Medal. 


2 As a quick check on Google images will confirm. An hour or two with homemade wire
contours in various shapes, such as a trefoil knot, or two circles (unlinked and linked), will be
highly instructive.


3 The curve must be a non-self-intersecting and bound a region of the surface. 


4 A facsimile of the relevant pages of Problems 45 and 46 will be found in Nitsche ([207], 6). 


5 A catenary, with respect to suitable axes, is given by the equation $y = \cosh x$.


6 Lagrange did write it in his (1806, 489), albeit in an ad hoc notation. 


7 The term was introduced by Sophie Germain in her (1831). 


8 The paper was submitted in 1784, but only published in 1787. 


9 It is named for the Belgian physicist Joseph Plateau who, although blind, showed in a series
of experiments that the surface of a liquid is in equilibrium if its mean curvature is constant. 
Weightless films, floating in a different liquid, will therefore have zero mean curvature and 
locally satisfy Lagrange’s equation. 


10 See Weierstrass [267, 268], and the paper [270], which was published for the first time 
posthumously in his Mathematische Werke, vol. 3. 


11 See the MIT account, http://ocw.mit.edu/courses/mathematics/18-994-seminar-in-geometry-fall-2004/lecture-notes/chapter18.pdf.


12 Lazarus Fuchs was developing the theory of these equations under his influence at the time. 


13 An inessential singular point is one where the solutions of the differential equation remain 
holomorphic in a neighbourhood of the point even though at least one of the coefficients of the 
differential equation is singular. 
24.1 Introduction


Several issues arise here. Does mechanics start with equations of 
motion, or with principles like the principle of least action? Is the 
Lagrangian formulation, although completely general, always the best, 
or can there be others? Is there a worthwhile parallel between 
mechanics and geometry? 

This is a tough chapter, so let me spell out the route through it, and
say what is essential. First, I present a clean, modern version of the key
mathematical ideas.¹ Given a mechanical problem expressed in terms of
a Lagrangian (see Sect. 7.6) in generalised coordinates $q_j$ and $\dot q_j$, it can
be represented in terms of more symmetric coordinates $q_j$ and $p_j$ and
a new function called the Hamiltonian, which is closely related to the
Lagrangian (Eq. (24.1)). Important here is what are called Hamilton’s
equations (Eq. (24.2)).

Hamilton had the brilliant idea of looking at what happens to the
time evolution of a Hamiltonian system. Now the upper end point of the
Hamiltonian (or Lagrangian) is allowed to be a function of time, and the
paths of the $q$s continually obey their Euler-Lagrange equations. This
gave him a function $W$ of the upper end points and the time, and it
satisfies a particular first-order partial differential equation (see
Eq. (24.3)) that became known as the Hamilton-Jacobi equation after
Jacobi saw more deeply into it.

If we suppose that the values $q_1, q_2, \ldots, q_n$ define a point
$(q_1, q_2, \ldots, q_n)$ in a space we can call $Q$, then it turns out that as
time goes by in a Hamiltonian system these points define a moving
hypersurface.

Moreover, solutions of the Hamilton-Jacobi equation satisfy one half 
of Hamilton’s equations, thus making a connection between a first- 
order partial differential equation and a family of first-order ordinary 
differential equations. Indeed, the q’s as they evolve describe the 
characteristic curves of the partial differential equation. This is what 
Cauchy appreciated (see Chap. 12), and it opened up, surprisingly, a 
way of solving some systems of ordinary differential equations by 
solving a partial differential equation. 

This being a history course, after presenting these ideas I document 
that Hamilton did possess them in his own way, that Jacobi re- 
interpreted them—and that he had scathing views about much of what 
had been said about variational principles before him—and that the 
ordinary differential equation—partial differential equation 
connection just mentioned was soon appreciated. 


24.2 Hamiltonian Dynamics 


Let us suppose we have a dynamical problem expressed in terms of a
Lagrangian. I shall write $L(q, \dot q, t)$ for the Lagrangian, where $q$ stands
for an $n$-tuple in variables $q_1, q_2, \ldots, q_n$ and $\dot q$ for the corresponding
$n$-tuple $\dot q_1, \dot q_2, \ldots, \dot q_n$; we can get a long way by thinking of $n = 1$.
I shall write $\Delta$ rather than $\delta$ for the variational symbol, to avoid
confusing $\delta$ and $S$, a symbol to be introduced shortly. I shall write $t^0$
for the initial time and $t^1$ for the final time.


As we shall see, William Rowan Hamilton defined a function, which
he called the principal function, as the time integral of the Lagrangian:
$$S = \int_{t^0}^{t^1} L(q, \dot q, t)\,dt.$$


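A standard variational argument shows that
$$\Delta S = \int_{t^0}^{t^1} \Delta L(q, \dot q, t)\,dt
= \int_{t^0}^{t^1} \sum_j \left(\frac{\partial L}{\partial q_j}\,\Delta q_j + \frac{\partial L}{\partial \dot q_j}\,\Delta \dot q_j\right) dt.$$
Note that there is one pair of terms for each $q_j$, $j = 1, \ldots, n$.
We integrate the second term in the integral by parts and obtain
$$\Delta S = \int_{t^0}^{t^1} \sum_j \left(\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_j}\right)\Delta q_j\,dt
+ \left[\sum_j \frac{\partial L}{\partial \dot q_j}\,\Delta q_j\right]_{t^0}^{t^1}.$$
If the end points are fixed then the term in square brackets vanishes. So
the variation of the integral vanishes if the integrand vanishes for all
$\Delta q$, and the result is the Euler-Lagrange equations:
$$\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_j} = 0, \qquad j = 1, \ldots, n.$$
Following Hamilton, we define
$$p_j = \frac{\partial L}{\partial \dot q_j}, \qquad\text{so that}\qquad \frac{dp_j}{dt} = \frac{\partial L}{\partial q_j}.$$
We now define the Hamiltonian
$$H = H(q, p, t) = \sum_j p_j \dot q_j - L. \tag{24.1}$$
Therefore
$$\frac{\partial H}{\partial q_j} = -\frac{\partial L}{\partial q_j} \qquad\text{and}\qquad \frac{\partial H}{\partial p_j} = \dot q_j - \frac{\partial L}{\partial p_j},$$
but $L$ is not a function of the $p_j$ and so $\partial L/\partial p_j = 0$, and so
$$\dot p_j = \frac{\partial L}{\partial q_j} = -\frac{\partial H}{\partial q_j}.$$
These equations
$$\dot p_j = -\frac{\partial H}{\partial q_j} \qquad\text{and}\qquad \dot q_j = \frac{\partial H}{\partial p_j} \tag{24.2}$$
for $j = 1, \ldots, n$ are called Hamilton’s equations or the canonical
equations. The solution to these equations resolves the dynamical
problem at stake.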

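They have a simple, if perhaps surprising, corollary that turns out to
be useful. We calculate
$$\frac{dH}{dt} = \sum_j \frac{\partial H}{\partial q_j}\frac{dq_j}{dt} + \sum_j \frac{\partial H}{\partial p_j}\frac{dp_j}{dt} + \frac{\partial H}{\partial t}.$$
But by the canonical equations, the RHS equals a sum of terms of the
form
$$-\frac{dp_j}{dt}\frac{dq_j}{dt} + \frac{dq_j}{dt}\frac{dp_j}{dt},$$
together with $\partial H/\partial t$, so the summations cancel, and it follows that
$$\frac{dH}{dt} = \frac{\partial H}{\partial t}.$$
In other words, if the Hamiltonian $H$ does not involve $t$ explicitly
(so $\partial H/\partial t = 0$) then $H$ is constant ($dH/dt = 0$). In conservative
systems, the Hamiltonian can be treated as the energy.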
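The conservation of $H$ can be checked numerically. The following is a minimal sketch, not part of the historical account: it assumes a harmonic oscillator with $H = p^2/(2M) + Kq^2/2$, illustrative values $M = K = 1$, and a leapfrog integrator; the function names are hypothetical.

```python
import math

M = 1.0  # mass (illustrative value, an assumption)
K = 1.0  # spring constant (illustrative value, an assumption)


def hamiltonian(q, p):
    """H(q, p) = p^2 / (2M) + K q^2 / 2 for a harmonic oscillator."""
    return p * p / (2.0 * M) + 0.5 * K * q * q


def leapfrog_step(q, p, dt):
    """Advance Hamilton's equations dq/dt = dH/dp = p/M and
    dp/dt = -dH/dq = -K q by one leapfrog step."""
    p_half = p - 0.5 * dt * K * q
    q_new = q + dt * p_half / M
    p_new = p_half - 0.5 * dt * K * q_new
    return q_new, p_new


def evolve(q0, p0, dt, steps):
    """Return the final state and the largest deviation of H from its
    initial value along the computed trajectory."""
    q, p = q0, p0
    h0 = hamiltonian(q, p)
    worst = 0.0
    for _ in range(steps):
        q, p = leapfrog_step(q, p, dt)
        worst = max(worst, abs(hamiltonian(q, p) - h0))
    return q, p, worst


q_end, p_end, drift = evolve(q0=1.0, p0=0.0, dt=1e-3, steps=10000)
print("q(10) =", q_end, " exact:", math.cos(10.0))
print("largest change in H:", drift)
```

Because this $H$ has no explicit dependence on $t$, the computed value of $H$ should stay close to its initial value, up to the small error of the numerical scheme.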

Now we turn to a different approach, one that often turns out to
have complementary virtues. Hamilton had the ingenious idea of
investigating what happened to the path given by the Euler-Lagrange
equations as the upper end point varied and as the time taken
varied. He may well have done this because he was interested in the
passage of light through crystals. So in what follows the $q(t)$ that
define the path satisfy the Euler-Lagrange equations.
He now regarded his principal function
$$S = \int_{t^0}^{t^1} L(q, \dot q, t)\,dt$$
as a function of its upper endpoint $t^1$, and the value of $q^1$ varies. The
initial values of the coordinates $q^0$ are fixed. We have $dS/dt^1 = L$.
Suppose that we fix a value of $t^1 = T$. Then we can define a function
$$W(q^1, T) = S,$$
where $S$ is evaluated along the path that obeys the Euler-Lagrange
equations and arrives at $q^1$ at time $T$. The distinction between these
two functions is that $S$ is defined on any path but $W$ is a function of
the end points and the time the system has been evolving.
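A variational argument now gives that
$$\Delta S = \int_{t^0}^{T} \sum_j \left(\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_j}\right)\Delta q_j\,dt
+ \left[\sum_j \frac{\partial L}{\partial \dot q_j}\,\Delta q_j\right]_{t^0}^{T},$$
but now we deduce that
$$\Delta W = \sum_j \frac{\partial L}{\partial \dot q_j}\,\Delta q_j^1,$$
where the term on the right is evaluated at $t = T$. But $p_j = \partial L/\partial \dot q_j$, so
$$\frac{\partial W}{\partial q_j^1} = p_j^1.$$
Now we allow $T$ to vary, and calculate $\partial W/\partial T$. We have
$$\frac{dW}{dT} = \frac{\partial W}{\partial T} + \sum_j \frac{\partial W}{\partial q_j^1}\frac{dq_j^1}{dT}
= \frac{\partial W}{\partial T} + \sum_j p_j^1\,\dot q_j^1.$$
We have $dS/dt^1 = L$ so, using Eq. (24.1),
$$\frac{\partial W}{\partial T} = -H(q^1, p^1, T).$$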


But the upper end points and the time are the only variables involved,
so we can drop the labels and write
$$W = W(q, t), \qquad \frac{\partial W}{\partial q} = p, \qquad \frac{\partial W}{\partial t} = -H(q, p, t).$$
The last equation can be written in this form
$$\frac{\partial W}{\partial t} + H\left(q, \frac{\partial W}{\partial q}, t\right) = 0. \tag{24.3}$$
In this form, it is called the Hamilton-Jacobi equation. It has the
property that $W$ does not appear explicitly, so if $W$ is a solution then so
is $W + c$, where $c$ is an arbitrary constant.
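As a concrete check of Eq. (24.3), here is a small sketch under stated assumptions: a free particle of mass $M$ (so $H = p^2/(2M)$) that leaves $q = 0$ at time $0$, for which integrating the Lagrangian along the straight-line path gives $W = Mq^2/(2t)$. The example, the value of $M$, and the function names are illustrative and not taken from the text.

```python
M = 1.0  # mass (illustrative value, an assumption)


def W(q, t, c=0.0):
    """Principal function of a free particle that leaves q = 0 at time 0
    and reaches q at time t, plus an arbitrary constant c."""
    return M * q * q / (2.0 * t) + c


def H(p):
    """Free-particle Hamiltonian H = p^2 / (2M)."""
    return p * p / (2.0 * M)


def hj_residual(q, t, c=0.0, h=1e-5):
    """Central-difference estimate of dW/dt + H(dW/dq), which the
    Hamilton-Jacobi equation requires to vanish."""
    dW_dt = (W(q, t + h, c) - W(q, t - h, c)) / (2.0 * h)
    dW_dq = (W(q + h, t, c) - W(q - h, t, c)) / (2.0 * h)
    return dW_dt + H(dW_dq)


residual = hj_residual(q=0.7, t=1.3)
residual_shifted = hj_residual(q=0.7, t=1.3, c=5.0)
print(residual, residual_shifted)
```

Both residuals should vanish up to finite-difference error, the second illustrating that adding a constant to a solution gives another solution.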


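It remains to show that the solutions of this equation solve half of
Hamilton’s equations, i.e. that
$$\dot p_j = -\frac{\partial H}{\partial q_j},$$
on the assumption that we have $\dot q_j = \partial H/\partial p_j$; these are taken as
part of the initial conditions. (At each point when $t = 0$ there is a
unique curve that obeys the Euler-Lagrange equations and the condition
just assumed. As we shall see in Chap. 25, this is closely analogous to
the existence of a geodesic on a manifold through a given point in a
given direction, and this in turn derives from the fact that the geodesic
equation is a second-order equation.)

Since $p_j = \partial W/\partial q_j$, we have
$$\dot p_j = \frac{d}{dt}\frac{\partial W}{\partial q_j} = \frac{\partial^2 W}{\partial t\,\partial q_j} + \sum_k \frac{\partial^2 W}{\partial q_j\,\partial q_k}\,\dot q_k.$$
To evaluate the first term on the RHS we differentiate the
Hamilton-Jacobi equation with respect to $q_j$; differentiation here
of $p$ with respect to $q$ is required because $p_k = \partial W/\partial q_k$, so we get
$$\frac{\partial^2 W}{\partial t\,\partial q_j} = -\frac{\partial H}{\partial q_j} - \sum_k \frac{\partial H}{\partial p_k}\frac{\partial^2 W}{\partial q_j\,\partial q_k}.$$
So we finally obtain
$$\dot p_j = -\frac{\partial H}{\partial q_j} + \sum_k \left(\dot q_k - \frac{\partial H}{\partial p_k}\right)\frac{\partial^2 W}{\partial q_j\,\partial q_k} = -\frac{\partial H}{\partial q_j},$$
as required.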

So now, if we can solve the Hamilton-Jacobi equation we can solve 
the canonical equations—and if we can solve the canonical equations 
we can solve the Hamilton-Jacobi equation. 

If we now suppose further that $H$ does not depend explicitly on $t$
then we may write Eq. (24.3) in the form
$$H\left(q, \frac{\partial W}{\partial q}\right) = -\frac{\partial W}{\partial t},$$
for which the general solution will be of the form
$$W(q, t) = W_0(q) - Et,$$
where $E$ is a constant. With a little more work (but you might think
there’s been enough) and on the further assumption that $H$ is the total
energy, and therefore a constant of the motion, it can be shown that we
can therefore think of the surfaces $W = \text{const.}$ as wave fronts moving
in $Q$, and their progress analysed in some detail.

We shall investigate the solutions of the Hamilton-Jacobi equation 
in Sect. 24.4, but we now turn to look at Hamilton’s original paper and 
at Jacobi’s comments on it and the theory out of which it grew. His 
comments are enjoyably fierce. 


24.3 Hamilton’s and Jacobi’s Theories of 
Dynamics 


I first note an annoying sign convention: what we write as $U$, the
potential energy of a system of particles, was denoted $-U$ in the
nineteenth century. To minimise confusion, I shall sometimes write
$\mathcal{U} = -U$ to denote potential energy with the old sign convention. So
where we write $L = T - U$ for the Lagrangian, these mathematicians
wrote $L = T + \mathcal{U}$; in each case, $T$ denotes the kinetic energy of the
system.

Hamilton took up the topic of dynamics, the motion of a system of
particles under their mutual attraction, in 1834, when he wrote his
essay “On a general method in dynamics”, and he wrote a “Second
essay” on the subject a year later. We shall concentrate on it.²

In this essay, he worked from the start in generalised coordinates $\eta_i$,
and introduced the new coordinates $\varpi_i = \partial T/\partial \dot\eta_i$. Here $T$ is
expressed with respect to the $\eta_i$ and $\dot\eta_i$; as a function of $\eta_i$ and
$\varpi_i$, the same function is denoted $F$. Hamilton wrote down the
expressions for varying $T$ and varying $F$ and deduced that
$$\frac{\partial (F - \mathcal{U})}{\partial \varpi_i} = \frac{\partial F}{\partial \varpi_i} = \dot\eta_i, \qquad \frac{\partial F}{\partial \eta_i} = -\frac{\partial T}{\partial \eta_i},$$
because the potential function $\mathcal{U}$ does not depend on $\varpi_i$. So
Lagrange’s equations take the form
$$\frac{d\varpi_i}{dt} = \frac{\partial (\mathcal{U} - F)}{\partial \eta_i}.$$
Hamilton set $H = F - \mathcal{U}$ and obtained the equations
$$\frac{d\eta_i}{dt} = \frac{\partial H}{\partial \varpi_i}, \qquad \frac{d\varpi_i}{dt} = -\frac{\partial H}{\partial \eta_i}.$$
These were later called the canonical equations. This is the first
appearance of the canonical equations, although important steps in that
direction had been taken by Poisson in a paper Hamilton cited;
Hamilton’s improvement was the introduction of the $\varpi_i$.


Hamilton then reintroduced the principal function $S$ as
$$S = \int_0^t (T + \mathcal{U})\,dt,$$
which says that the principal function $S$ is the time integral of the
Lagrangian. A calculation of the variation of $S$ enabled Hamilton to
show that
$$\frac{\partial S}{\partial t} + H = 0$$
(or rather, a set of equations equivalent to that result).

At this point, Fraser and Nakane make several valuable 
observations. They note ([108], 184) that the new derivation did not
assume the conservation of mechanical energy, a fact that Hamilton did 
not notice—he repeatedly stated, indeed, that his method was 
restricted to cases where that law applied. 

Moreover, as it happens, the variational conditions often imposed by 
Lagrange are the vanishing of the “action” and fixed end points for the 
curves under consideration. This can have the effect that the only curve 
considered is the minimiser itself—there are no other candidates! 
Hamilton allowed the end points to vary and so produced a genuine 
family of candidate curves. 

Finally, they remark that Hamilton also noticed that if the variation
of $S$ vanishes then
$$\delta S = \sum_i \left[\varpi_i\,\delta\eta_i\right]_0^t - \int_0^t \sum_i \left(\frac{d\varpi_i}{dt} - \frac{\partial (T + \mathcal{U})}{\partial \eta_i}\right)\delta\eta_i\,dt.$$
If the end points are fixed then the first sum becomes zero and the
other term is simply Lagrange’s equations.


24.3.1 Jacobi 


Carl Gustav Jacob Jacobi must have read Hamilton’s papers in 1836, 
because he wrote to his brother Moritz about them in September 1836 
to say that they had led him to make a deep study of dynamics.³
Although Jacobi had a genuine interest in mechanics, and had recently 
made an important breakthrough in the three-body problem, his 
reworking of Hamilton’s ideas was much more that of an analyst. 

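He established carefully that the principal function $S$ is a function of
the variables $x_i, y_i, z_i$, the initial positions, and $t$. Then he
wrote down the expression for its variation, and deduced that
$$\frac{\partial S}{\partial t} + \frac{1}{2}\sum_i \frac{1}{m_i}\left[\left(\frac{\partial S}{\partial x_i}\right)^2 + \left(\frac{\partial S}{\partial y_i}\right)^2 + \left(\frac{\partial S}{\partial z_i}\right)^2\right] = \mathcal{U},$$
an equation he called Hamilton’s equation.⁴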

A feature of Jacobi’s derivation was that it made clear that the forces 
could depend explicitly on the time. In principle, this extends the study 
of dynamics to non-conservative systems in which energy is lost, but 
Jacobi lost interest in this in the 1840s, and it was for his successors to 
appreciate this advance on the work of Hamilton. 

Less clearly, and we cannot discuss this point fully here, Jacobi’s 
method is not wholly variational because he did not discuss the 
variation of the end points. This is not a problem for Jacobi’s 
presentation, because the derivation of “Hamilton’s equation” is 
entirely a piece of partial differential equation theory. 

Jacobi had various criticisms of Hamilton’s work that we cannot go 
into here and for which the reader is referred to Fraser and Nakane 
[108]. It says a lot about the growing place for mathematics in 
Germany (and, of course, France) and its uncertain place in Great 
Britain even with a mathematician of Hamilton’s talents, that there is 
much justice in Nakane and Fraser’s concluding remarks (p. 220): 


Hamilton was the great creator, and it is unimaginable that 
Jacobi could have reached the level of remarkable abstract 
insight that he did without a foundation already in place. 

Jacobi nevertheless had a better knowledge of contemporary 
analysis and a better sense for how the new ideas should be 
developed at an appropriate theoretical level within the calculus 
of variations, mathematical dynamics and differential equations. 
He possessed as well a talent for making the new ideas 
accessible to receptive mathematicians. Although he died some 
fifteen years earlier than Hamilton, his posthumous Vorlesungen 
would become the most influential work in the history of 
mathematical dynamics since Lagrange. 


With this in mind, it seems worthwhile to give some quotations from 
Jacobi’s posthumously published lectures on dynamics, starting with an 
extract from the sixth lecture.⁵


We now come to a new principle which, unlike earlier ones, does 
not give an integral. This is the “principle of least action” 
incorrectly called that of least work. Its importance lies first in 
the form in which it presents the differential equations of 
motion, and second in that it gives a function which is minimized 
when these differential equations are satisfied. Indeed, such a
minimum does exist in all examples, but the reason for this is 
unknown. Whereas the interest of this principle consists 
precisely in the fact that one can generally construct a minimum, 
formerly too much importance was attributed to the existence of 
such a minimum. An example of the principle under 
consideration comes from Euler’s “de motu projectorum”. After
... proving the principle for attraction to fixed centers, he did not
succeed in extending it to the n-body problem, for which he did
not know the principle of kinetic energy; he contented himself
with stating that the computations were very lengthy. But
Euler said that the principle of least work had to be valid also
here, since the fundamental results of a sound metaphysics
revealed that forces in nature always do the least work.


However, neither a sound nor any metaphysics shows this, 
and indeed Euler was led to this expression only through 
misunderstanding of the name “least work.” Maupertuis meant 
that nature achieved her work with the least expenditure of 
force, and this is the true meaning of the “principle of least 
action.” 

In my opinion, this principle is presented incomprehensibly
in all textbooks, even in the best, those by Poisson, Lagrange,
and Laplace. Namely, it is stated that the integral $\int \sum_i m_i v_i\,ds_i$
(where $v_i = ds_i/dt$ is the velocity of the point $m_i$) is minimized,
when taken from one position of the system to another. To be
sure, this is only stated to be valid for conservative systems, but 
it is forgotten that one must eliminate time from the above 
integral and reduce everything to space elements. Moreover, this 
integral must be understood to be a minimum for given initial 
and final configurations and all possible paths joining them. 


In his eighth lecture, Jacobi obtained the Lagrangian form of the 
equations of dynamics, and wrote 


In place of the Principle of Least Work, one can substitute 
another principle, which also consists in the vanishing of a first 
variation, and which can be derived from the differential 
equations of motion even more simply than from the Principle of 
Least Work. This variational principle seems to have been 
unnoticed previously, because - in contrast to the Principle of 
Least Work - it does not correspond to a minimum principle. 
Hamilton was the first to use this principle as a point of 
departure. We will use it to set down the equations of motion in 
the form given by Lagrange in his Mécanique analytique. 

Now let $X_i, Y_i, Z_i$ be the partial derivatives of a function $\mathcal{U}$,
and let $T$ be half the living force [kinetic energy], that is,
$$T = \frac{1}{2}\sum_i m_i\left[\left(\frac{dx_i}{dt}\right)^2 + \left(\frac{dy_i}{dt}\right)^2 + \left(\frac{dz_i}{dt}\right)^2\right];$$
then the new principle is
$$\delta\int (T + \mathcal{U})\,dt = 0. \tag{1}$$
This principle is equivalent to the Principle of Least Work, but
more general in that $t$ may enter $\mathcal{U}$ explicitly, which is excluded
in the previous Principle [of Least Work], because in the latter
time must be eliminated using the theory of living force
[conservation of energy], and this holds only when $t$ does not
enter $\mathcal{U}$ explicitly. We will use equation (1) to derive the
differential equations of motion from a first-order partial
differential equation. As Hamilton showed, one can decompose
the variation in (1), using integration by parts, into two parts, of
which one is outside and the other inside the integral sign, and
both must vanish. In this way, the integrand, which equals zero,
gives the differential equations of the problem, and the
expression outside the integral sign gives its integral equations.
The complete statement of the new principle reads as
follows: Let the configurations of the system be given at a
specified initial time $t_0$ and final time $t_1$. Then the actual
intervening motion is determined by the equation
$\delta\int (T + \mathcal{U})\,dt = 0$ of (1).
Here the integral is taken from $t_0$ to $t_1$; $\mathcal{U}$ is the force
function and can contain the time explicitly, and $T$ is half the
living force.


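Jacobi then derived the equations
$$\frac{dp_j}{dt} = \frac{\partial (T + \mathcal{U})}{\partial q_j}, \qquad p_j = \frac{\partial T}{\partial \dot q_j}.$$
In his ninth lecture, Jacobi then obtained the Hamiltonian form of the
equations of dynamics. He first obtained the equations in the form in
which Poisson had presented them in 1820:
$$\frac{dp_j}{dt} = \frac{\partial \mathcal{U}}{\partial q_j} + \frac{\partial T}{\partial q_j},$$
in which $\partial \mathcal{U}/\partial q_j$ depends only on the $q_j$ and $\partial T/\partial q_j$ is a
homogeneous quadratic function of the $\dot q_j$ and therefore of the $p_j$.
This yields the equations
$$\frac{dp_j}{dt} = P_j, \qquad \frac{dq_j}{dt} = Q_j,$$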
about which he remarked 


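This is Poisson’s form of the equations of motion, where the $Q_j$
and $P_j$ contain no variables other than the ps and the qs. This
system of 2k equations has the following noteworthy properties:
$$\frac{\partial Q_j}{\partial p_k} = \frac{\partial Q_k}{\partial p_j}, \qquad \frac{\partial Q_j}{\partial q_k} = -\frac{\partial P_k}{\partial p_j}, \qquad \frac{\partial P_j}{\partial q_k} = \frac{\partial P_k}{\partial q_j}. \tag{2}$$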


Of these, Poisson (loc. cit.) writes down those of the first group,
while the rest can be written down directly from his results.
Equations (2) show that the quantities $Q_j$ and $P_j$ are to be
recognized as the partial derivatives of a single function of the
$p_j$ and $q_j$. This observation, which comes directly from
equations (2), Poisson does not make; still less does he try to use 
this function. It was Hamilton who first expressed it, and 
through the introduction of his characteristic function the whole 


reformulation is made extraordinarily easier. One can reach the 
same conclusion almost by one’s self, if one derives the kinetic 
energy theorem from the second Lagrange form of the 
differential equations given in formula (9) [of the Eighth 
Lecture]. 


24.4 First-Order Partial Differential Equation 
Theory 


None of this would be much use unless the actual equations could be 
solved, and many authors comment explicitly on the utility of the 
Hamilton-Jacobi equation. After all, one’s experience is that partial 
differential equations are usually hard to solve, whereas ordinary 
differential equations are usually easier. So, given a first-order partial 
differential equation, one looks for the system of characteristic 
equations, which are ordinary differential equations, and hopes to solve 
them. But, as Hilbert and Courant commented (Vol. 2, 107):

Hamilton and Jacobi achieved a major success by recognizing 
that this relationship may be reversed. To be sure, the 
integration of a partial differential equation is usually 
considered as a problem more difficult than that of a system of 
ordinary differential equations. In mathematical physics one is 
often led, however, to a system of ordinary differential equations 
in canonical form. These equations may be difficult to integrate 
by elementary methods, while the corresponding partial 
differential equation is manageable; in particular, it may happen 
that a complete integral is easily obtained, e.g., with the help of 
the separation of variables (cf. Ch. I §3). Knowing the complete 
integral, one can then solve the corresponding system of 
characteristic ordinary differential equations by processes of 
differentiation and elimination. This fact, which is contained in 
the earlier results of §4 and §8, can be formulated in a 
particularly simple way for the case of canonical differential 
equations and can be verified analytically, independently of the 
motivation [...]. 


Now, the Hamilton-Jacobi equation is a first-order partial differential
equation in $n$ variables $q_1, \ldots, q_n$ and we expect to find a system of $n$
first-order ordinary differential equations for it whose solutions are
the characteristic curves. In fact, these equations are precisely the
equations $\dot q_j = H_{p_j}$ that we took on trust earlier. So we see a very
strong connection between the theory of first-order partial differential
equations and the Hamilton-Jacobi theory of dynamics. Curiously, it is
not clear how much of Cauchy’s theory was known to either of these
men at the time.
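The reversal that Courant and Hilbert describe, solving the ordinary differential equations from a complete integral "by processes of differentiation and elimination", can be sketched in the simplest case. The example below is an illustration under stated assumptions, not taken from the sources quoted: a free particle with $H = p^2/(2M)$, whose Hamilton-Jacobi equation has the complete integral $W = aq - a^2 t/(2M)$ with parameter $a$. Setting $\partial W/\partial a = b$ for a second constant $b$ and solving for $q$ gives the motion. All names and numerical values are hypothetical.

```python
M = 2.0  # mass (illustrative value, an assumption)


def W(q, t, a):
    """Complete integral of W_t + (W_q)^2 / (2M) = 0, with parameter a."""
    return a * q - a * a * t / (2.0 * M)


def partial(f, args, i, h=1e-6):
    """Central-difference partial derivative of f in its i-th argument."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)


def trajectory(t, a, b):
    """Impose dW/da = b and eliminate to find q(t); then p = dW/dq.
    Here dW/da is linear in q, so one secant step between the trial
    values q = 0 and q = 1 solves the equation."""
    g0 = partial(W, (0.0, t, a), 2) - b
    g1 = partial(W, (1.0, t, a), 2) - b
    q = -g0 / (g1 - g0)
    p = partial(W, (q, t, a), 0)
    return q, p


a, b = 3.0, 0.5
q1, p1 = trajectory(1.0, a, b)
q2, p2 = trajectory(2.0, a, b)
velocity = (q2 - q1) / (2.0 - 1.0)
print(q1, p1, q2, p2, velocity)
```

The resulting motion has constant momentum $p = a$ and velocity $p/M$, which is what Hamilton's equations $\dot q = \partial H/\partial p$ and $\dot p = -\partial H/\partial q = 0$ require for this Hamiltonian.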

A comment of the historian Tom Hawkins provides a helpful 
conclusion to this difficult mathematics⁶:


For Hamilton the importance of the equivalence lay in the 
direction of replacing the equations of motion by the two partial 
differential equations so that thereby the difficulty of 
determining the motion of a system of masses “is at least 
transferred from the integration of many equations of one class 
to the integration of two of another” [1834]. Jacobi realized that, 
at least in terms of the integration theory of first-order partial 
differential equations, which Hamilton does not appear to have 
had in mind, “reduction” [...] is hardly an advance. As 

Jacobi explained in a letter to the secretary of the mathematics 
and physics section of the Berlin Academy: “Little would seem to 
be gained by this reduction to a partial differential equation 
since according to Pfaff’s method ...- and for more than three 
variables till now nothing further was known about the 
integration of partial differential equations of the first order - 
the integration of the one partial differential equation to which 
the dynamical problem is reduced is much more difficult than 
integration of the directly given system of ordinary differential 
equations of motion.” He went on to explain, however, that “if
Hamilton’s investigations are extended to all first-order partial 
differential equations, as can be done without difficulty, it is on 
the other hand a significant discovery in the theory of first-order 
partial differential equations that they can always be reduced to 
a single system of ordinary differential equations, which 


previously according to the Pfaffian method was insufficient” 
(1837a: 50-51). 

Here Jacobi was referring to his discovery that the problem
of determining a complete solution to the general first-order
partial differential equation
reduces to the complete integration of a single system of
ordinary differential equations [...] of order $2n - 1$, which is in
fact the first system that arises in the application of Pfaff’s
method (1837b: 101-102).


24.5 Exercises 
Questions 


1. An abundance of daily experience suggests that Euclidean
geometry is true, and adequately axiomatised. What would have to
be done (a question Jacobi asked when giving his courses on
mechanics) to establish an axiomatic account of everyday
mechanics?


Footnotes 


1 This one follows www.damtp.cam.ac.uk/user/tong/dynamics.htm, which is David Tong’s 
Cambridge notes. 


2 This account closely follows [108] which can profitably be consulted for many interesting 
insights. They look in some detail at the first essay. 


3 See Koenigsberger ([162], 198), quoted in Hawkins ([139], 205) and Pulte ([230]). 


4 The same equation appears in [135]. Jacobi did not call S the principal function, however. 


5 They were published in 1866. Compare the extract in the Birkhoff Source Book, 374-379, 
which gives some of the technical material as well. 


6 See Hawkins ([139], 206). 
25.1 Introduction


One of the most important developments in the application of the
calculus of variations was the exploration of the close analogy between
Hamiltonian dynamics and Gaussian differential geometry, and it is to
this that we now turn. These connections have been carefully explored
by the historian Jesper Lützen in his [193], and we follow his account
here.¹ The chain of ideas is as follows:


• Gauss introduced the idea of geodesics on a surface and coordinate
systems largely made up of geodesics (compare latitude and
longitude on a sphere).

• Liouville mimicked these arguments in the context of mechanics but
without imposing a geometrical interpretation on mechanics.

• Lipschitz interpreted a mechanical trajectory as a geodesic on a
surface (or higher dimensional analogue) and thought of mechanics in
geometric terms.

• Darboux wrote all this up carefully and lucidly: mechanics can be
studied geometrically.


25.2 Gaussian Curvature 


In his Disquisitiones circa superficies curvas (or, General investigations of
curved surfaces) [116] that created the subject of intrinsic differential
geometry, Gauss introduced the idea of a surface as either the image of
a map from a patch of $\mathbb{R}^2$ to $\mathbb{R}^3$ or a domain with coordinates $(p, q)$ and
a metric
$$ds^2 = E\,dp^2 + 2F\,dp\,dq + G\,dq^2,$$
where $E$, $F$, and $G$ are functions of $p$ and $q$.

He then imposed a coherent logic on what seemed at times to be a 
sprawl of formulae related by arbitrary changes of variable. In 
particular, he defined a concept of curvature of a surface and showed 
that it was intrinsic to the surface. That is, it could be defined entirely 
without reference to any ambient space in which the surface lived (such 
as, say, a normal to the surface implies). As Gauss puts it, if one surface
can be mapped isometrically onto another then the values of the 
curvature agree at corresponding points. This was an entirely novel and 
unexpected idea (Gauss called the corresponding theorem the 
exceptional theorem, Theorema egregium), and it gradually 
transformed the subject of differential geometry. 

Gauss defined his measure of curvature by means of the Gauss map
(see Sect. 23.4). He defined the (Gaussian) curvature at the point $P$ as
the limit
$$\lim_{S \to P} \frac{\text{area of } S'}{\text{area of } S},$$
where $S$ is a region about the point $P$ and $S'$ is its image under the
Gauss map. If the surface $S$ is a plane then all the normals point in the
same direction and the Gauss map sends the entire plane to a point: the
plane has Gaussian curvature zero. The Gauss map sends a circular
cylinder to a circle, which has zero area, so the curvature of a cylinder
is also zero. The Gaussian curvature of a sphere of radius $R$ is easy to
find: lengths on the domain sphere are scaled by a factor of $1/R$ by the
Gauss map, so areas are scaled by $1/R^2$ and the Gaussian curvature is
$1/R^2$. Lastly, saddle-shaped regions have negative curvature, as can be
seen from the figure of the catenoid, Fig. 23.1.

Gauss showed that the value of the (Gaussian) curvature at a point
was always the product of the extremal curvatures at that point, which
are the reciprocals of the extremal radii of curvature. So, writing $K$
for the Gaussian curvature and $k_1$ and $k_2$ for the extremal curvatures
at a point, one has $K = k_1 k_2$. In particular, if the surface is a
minimal surface, its mean curvature vanishes and the extremal
curvatures at a point are $\pm k$ (they will vary with $P$), so one has
$K = -k^2$. So a minimal surface has negative Gaussian curvature
wherever $k \neq 0$.
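These statements can be checked numerically. The sketch below is an illustration under stated assumptions, not Gauss's own procedure: it uses the standard modern formulas $K = (LN - M^2)/(EG - F^2)$ and $H = (EN - 2FM + GL)/(2(EG - F^2))$ in terms of the first and second fundamental forms, estimates the derivatives by finite differences, and applies them to an assumed parametrisation of the catenoid and of a sphere of radius 2. The function names are hypothetical.

```python
import math


def catenoid(u, v):
    return (math.cosh(v) * math.cos(u), math.cosh(v) * math.sin(u), v)


def sphere(u, v, radius=2.0):
    return (radius * math.cos(v) * math.cos(u),
            radius * math.cos(v) * math.sin(u),
            radius * math.sin(v))


def combo(terms):
    """Linear combination of 3-vectors, given (coefficient, vector) pairs."""
    return tuple(sum(c * vec[i] for c, vec in terms) for i in range(3))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def curvatures(X, u, v, h=1e-4):
    """Return (K, H), the Gaussian and mean curvature of the patch X at
    (u, v), from finite-difference first and second fundamental forms."""
    s1, s2 = 0.5 / h, 1.0 / (h * h)
    Xu = combo([(s1, X(u + h, v)), (-s1, X(u - h, v))])
    Xv = combo([(s1, X(u, v + h)), (-s1, X(u, v - h))])
    Xuu = combo([(s2, X(u + h, v)), (-2 * s2, X(u, v)), (s2, X(u - h, v))])
    Xvv = combo([(s2, X(u, v + h)), (-2 * s2, X(u, v)), (s2, X(u, v - h))])
    Xuv = combo([(s2 / 4, X(u + h, v + h)), (-s2 / 4, X(u + h, v - h)),
                 (-s2 / 4, X(u - h, v + h)), (s2 / 4, X(u - h, v - h))])
    n = cross(Xu, Xv)
    length = math.sqrt(dot(n, n))
    n = tuple(c / length for c in n)  # unit normal
    E, F, G = dot(Xu, Xu), dot(Xu, Xv), dot(Xv, Xv)
    L, M, N = dot(Xuu, n), dot(Xuv, n), dot(Xvv, n)
    det = E * G - F * F
    K = (L * N - M * M) / det
    H = (E * N - 2.0 * F * M + G * L) / (2.0 * det)
    return K, H


K_cat, H_cat = curvatures(catenoid, 0.3, 0.5)
K_sph, H_sph = curvatures(sphere, 0.3, 0.5)
root = math.sqrt(H_cat * H_cat - K_cat)
k1, k2 = H_cat + root, H_cat - root  # extremal curvatures of the catenoid
print("catenoid: K =", K_cat, " H =", H_cat, " k1, k2 =", k1, k2)
print("sphere of radius 2: K =", K_sph)
```

For the catenoid the mean curvature should come out as zero and the two extremal curvatures as $\pm k$, so that $K = -k^2 < 0$; for the sphere of radius 2 the Gaussian curvature should be $1/4$.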

Gauss’s reformulation of differential geometry spread slowly across
the mathematical community, and because its deepest discovery
concerned the existence of intrinsic properties of surfaces, applications
of it came only slowly to the study of minimal surfaces, which belong in
extrinsic geometry.

To study geodesics on such a surface, Gauss noted that one can
impose an analogue of polar coordinates on a surface, by choosing an
arbitrary point $O$ as origin, an arbitrary geodesic as the base line, and
assigning the coordinates $(r, \varphi)$ to the point that is a distance $r$ along
the geodesic that meets the base line at the angle $\varphi$. He then proved
that the curve defined by the points a given distance $r$ from $O$ is
everywhere at right angles to the geodesics through $O$. In a system of
polar coordinates, $E = 1$, $F = 0$, and $G$ must be positive so that the
metric is positive definite.
He then investigated how to change a given coordinate system to a
system of polar coordinates, and obtained two equations: Eq. (25.1),
from which $r$ can be determined, and Eq. (25.2), from which, when $r$ is
known, $\varphi$ can be found. He observed that these
equations can certainly be solved, and indicated that power series
solutions would be particularly interesting. In this way, geodesics are
found that are the orthogonal trajectories to the curves $r = \text{const.}$


25.2.1 Liouville’s Contributions 


Gauss’s ideas about differential geometry were brought to France by
Liouville and Bonnet. Liouville was a particularly versatile
mathematician and an editor of a journal he had founded, which put
him in a good position to know what was going on generally, and one
of his contributions was to find an important way in which differential
geometry connected to the study of partial differential equations.

In the 1840s, one of Liouville’s concerns was to acquaint his fellow
French mathematicians with the work of Gauss on differential
geometry.² One of the ways he did this was by re-issuing Monge’s
Application d’analyse à la géométrie, to which he added a series of
notes, remarking that the notes


deal with those points ...for which Mr. Gauss has opened new 
ways; besides our aim is to indicate to young people the sources 
where they can find information rather than giving them regular 
lectures. 


One result rather sketchily proved by Gauss (although one can
legitimately wonder about Liouville’s argument, as we shall see) was
that a surface given in the form $(x(u, v), y(u, v), z(u, v))$ and with a
metric of the form $ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2$ can be given a new
coordinate system $\alpha$ and $\beta$ for which the metric takes the form

\[ ds^2 = \lambda(\alpha, \beta)(d\alpha^2 + d\beta^2). \]

Such a system of coordinates is called isothermal, and its advantage
is not only that it is easier to calculate with but that the coordinate
curves $\alpha = \text{const.}$ and $\beta = \text{const.}$ meet everywhere at
right angles. This condition on two families of curves occurs naturally in
differential geometry, for example, the principal curves on a surface meet
at right angles.³

To prove this claim, Liouville formally factorised the metric as a
product

\[ ds^2 = \left[\sqrt{E}\,du + \frac{F + i\sqrt{EG - F^2}}{\sqrt{E}}\,dv\right]\left[\sqrt{E}\,du + \frac{F - i\sqrt{EG - F^2}}{\sqrt{E}}\,dv\right]. \]

Liouville now supposed that there was an integrating factor $\mu + i\nu$ that
makes the first factor an exact differential, which he wrote as
$d(\alpha + i\beta)$. Multiplying the second factor by the conjugate
integrating factor $\mu - i\nu$ makes it exact and equal to
$d(\alpha - i\beta)$, and multiplying the two factors yields

\[ (\mu^2 + \nu^2)\,ds^2 = d\alpha^2 + d\beta^2, \]

which Liouville wrote as

\[ ds^2 = \lambda(\alpha, \beta)(d\alpha^2 + d\beta^2), \qquad \lambda(\alpha, \beta) = (\mu^2 + \nu^2)^{-1}. \]


The weak point in this argument is the existence of an integrating 
factor, which had been proved in the real case by an argument that does 
not extend to the complex case that Liouville was dealing with without 
more thought than he gave it. In fact, that point had been dealt with by 
Cauchy in 1819, but his result was not well known and it can seem that 
even Cauchy had forgotten it; in any case, Liouville did not mention it. 
Gauss’s great discovery had been the intrinsic nature of the
curvature $K$ of a surface. Liouville offered what he believed was a
simpler proof of this result, by showing that $K$ satisfies a partial
differential equation in terms of $\lambda$, which is an intrinsic quantity:

\[ K = -\frac{1}{2\lambda}\left(\frac{\partial^2 \log\lambda}{\partial\alpha^2} + \frac{\partial^2 \log\lambda}{\partial\beta^2}\right). \]

It would take us too far afield to give his proof, but we can observe the
uses of this result. For example, Liouville showed that a surface of zero
curvature can be mapped isometrically onto a plane by solving the
equation

\[ \frac{\partial^2 \log\lambda}{\partial\alpha^2} + \frac{\partial^2 \log\lambda}{\partial\beta^2} = 0. \]
As for surfaces of non-zero curvature, those of constant curvature stand
out as the first to analyse. If the curvature of such a surface is, say,
$-\frac{1}{a^2}$, then $\lambda$ satisfies the partial differential equation

\[ \frac{\partial^2 \log\lambda}{\partial\alpha^2} + \frac{\partial^2 \log\lambda}{\partial\beta^2} = \frac{2\lambda}{a^2}, \]

an equation that has become known as Liouville’s equation. It is more
often written today in the form

\[ \frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = \frac{2}{a^2}\,e^{z}, \]

where $x = \alpha$, $y = \beta$, and $z = \log\lambda$.
Liouville tackled this equation by introducing complex coordinates

\[ \alpha + i\beta = u, \qquad \alpha - i\beta = v, \]

which enabled him to write the equation as

\[ \frac{\partial^2 \log\lambda}{\partial u\,\partial v} = \frac{\lambda}{2a^2}. \]

In Note 4 of the re-edited book by Monge, Liouville merely stated the
solution he had found, offering as the complete integral depending on
two arbitrary functions $\varphi(u)$ and $\psi(v)$ the expression

\[ \lambda = \frac{4a^2\,\varphi'(u)\,\psi'(v)}{(\varphi(u) + \psi(v))^2}. \]

Further work by Liouville enabled him to obtain as a surface of
constant negative curvature the surface obtained by rotating a tractrix
about its axis, the surface later known as the pseudosphere.
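The link between curvature and the conformal factor can be tried out numerically. The sketch below assumes the standard modern form of the relation for an isothermal metric $ds^2 = \lambda(d\alpha^2 + d\beta^2)$, namely $K = -\frac{1}{2\lambda}\Delta\log\lambda$; the function name `curvature` and the three sample metrics are illustrative choices, not taken from Liouville. It uses a five-point finite-difference Laplacian.

```python
import math

def curvature(lam, alpha, beta, h=1e-3):
    """Gaussian curvature of ds^2 = lam * (d alpha^2 + d beta^2).

    Assumes K = -(1 / (2 lam)) * Laplacian(log lam), with the Laplacian
    approximated by the five-point stencil.
    """
    def log_lam(a, b):
        return math.log(lam(a, b))
    laplacian = (log_lam(alpha + h, beta) + log_lam(alpha - h, beta)
                 + log_lam(alpha, beta + h) + log_lam(alpha, beta - h)
                 - 4 * log_lam(alpha, beta)) / h**2
    return -laplacian / (2 * lam(alpha, beta))

# Upper half-plane metric lam = 1 / beta^2: constant curvature -1.
half_plane = lambda a, b: 1.0 / b**2
# Rescaled by a^2 with a = 2: constant curvature -1 / a^2 = -0.25.
scaled = lambda a, b: 4.0 / b**2
# lam = 4 (alpha^2 + beta^2) is |f'(z)|^2 for f(z) = z^2, so log lam is
# harmonic away from the origin and the metric is flat.
flat = lambda a, b: 4.0 * (a * a + b * b)

print(curvature(half_plane, 0.2, 1.5))
print(curvature(scaled, 0.2, 1.5))
print(curvature(flat, 1.0, 0.5))
```

The first two metrics are instances of the constant-curvature case and the third of the zero-curvature case, where $\log\lambda$ is harmonic.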


25.3 Geometrising Mechanics 


In a paper [187], Liouville observed that the kinetic energy of a
conservative system satisfies

\[ \sum m v^2 = 2(U + K), \]

where $K$ is a constant. From the definition $v = \frac{ds}{dt}$, he obtained

\[ dt^2 = \frac{\sum m\,ds^2}{2(U + K)}, \]
and so the action integrand can be written as the square root of 
m 
2(h+ U) ») qndajdae. 
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where @, 6, y, 0, A, the sum of the kinetic energy T and the (negative of 


what we would call the) potential energy U is a constant (the total 
energy) and 


m n 
Liouville now observed that because the quadratic differential form is
positive definite it can be written as a sum of squares in a new system 
of coordinates, so 


3 q jkdg jdqx = sy i, 
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where 


Ps) 


i = P kk: 
k=1 


We can think of the array $(P_{jk})$ as a matrix, and the array $(P^{jk})$,
which we shall shortly meet, as its inverse. Those ideas had not yet fully
come into mathematics, and Liouville had to write everything out in full.
He then wrote 


n 


dé = > ili. 
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where F(-—, —, —, | — x) is a function yet to be determined. It follows 


that 
n 
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k 
Liouville now demanded that y satisfies the first-order partial


differential equation 
A little more work (here omitted) allowed Liouville to write the action
integral as 


1/2 
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It is clear that the integral takes a minimum value when 
DY px(njle - nylj)° = (), which occurs when 
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and along this trajectory A = @(1) — @(0). 


Thus, as Lützen ([193], 33) points out, Liouville rederived the
Hamilton-Jacobi formalism in an elegant way. But it seems that he did
not draw any analogy with the differential geometry that we are
tempted to read into his work, and most likely did not appreciate it.

The posthumous publication of Riemann’s Habilitation lecture on 
geometry in 1867 slowly provoked mathematicians to investigate 
differential geometry in n dimensions. Importantly, Beltrami’s two 
papers [5, 6] showed how to rigorously define intrinsic non-Euclidean 
geometry in any number of dimensions. In his [188], Rudolf 
Lipschitz showed that Hamiltonian mechanics was possible in such a 
setting, thus keeping alive the idea that these new geometries are 
candidates for physical space.⁴ Lipschitz had already written on
mechanics with an eye to geometry in his [188] and in a later French
summary of his work ([189], 297-298) he wrote⁵


One of the principal aims of the research that we shall analyze 
here was the profound study of Gauss’s measure of curvature. In 
addition to the approaches which have hitherto led to this goal 
one may choose an approach which consists in presenting, in a
general way, all the fundamental concepts related to the 
curvature of a surface and then deduce from them the concept of 
the measure of curvature. With this in mind, the idea is to find a 
definition of the radius of curvature which will lend itself to a 
natural extension. Here the principles admitted in ordinary 
mechanics leads to the following theorem: When a material 


point which is not influenced by any accelerating forces is bound 
to move on a given surface, the pressure exerted in each point of
the trajectory is inversely proportional to the radius of curvature 
of this trajectory. Accordingly, one may define the reciprocal of 
the radius of curvature as a quantity which is directly 
proportional to the resulting pressure of this motion. 


The measure of curvature at issue here is what is called geodesic 
curvature. Recall that a parameterised curve in space has at each point 
a tangent vector, a principal normal that measures the rate of change of 
the tangent vector, and a binormal (with which we shall not be 
concerned). Just as the tangent vector captures the velocity of a point 
moving on the curve, the normal vector captures its acceleration. Now 
suppose that the curve lies on a surface. At each point, the magnitude of 
the component of the normal to the curve that lies in the tangent plane 
to the surface is the geodesic curvature. It has this name because when 
that component vanishes the normal to the curve and the normal to the 
surface coincide and there is no acceleration in the tangent plane. 
Therefore, in terms of the intrinsic geometry of the surface, the curve is
as straight as it can be, which is the definition of a geodesic.
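As a small illustration of this definition, the following sketch computes the geodesic curvature of a circle of latitude on the unit sphere by projecting the curve's curvature vector onto the tangent plane. The example and the function name `geodesic_curvature_of_latitude` are chosen here for illustration and are not taken from Lipschitz. The equator, a great circle, should come out with geodesic curvature zero; other latitudes should not.

```python
import math

def geodesic_curvature_of_latitude(theta, t=0.7):
    """Geodesic curvature of the latitude circle at colatitude theta.

    The circle is c(t) = (sin(theta) cos t, sin(theta) sin t, cos(theta))
    on the unit sphere, traversed at constant speed sin(theta).
    """
    s, c = math.sin(theta), math.cos(theta)
    point = (s * math.cos(t), s * math.sin(t), c)
    # Curvature vector with respect to arc length: c''(t) / |c'(t)|^2.
    k = (-math.cos(t) / s, -math.sin(t) / s, 0.0)
    # On the unit sphere the surface normal at a point is the point itself.
    normal = point
    k_dot_n = sum(ki * ni for ki, ni in zip(k, normal))
    # Remove the normal component; what remains lies in the tangent plane.
    tangential = tuple(ki - k_dot_n * ni for ki, ni in zip(k, normal))
    return math.sqrt(sum(x * x for x in tangential))

print(geodesic_curvature_of_latitude(math.pi / 2))  # equator
print(geodesic_curvature_of_latitude(math.pi / 3))  # a smaller circle
```

For a general colatitude the value agrees with $|\cot\theta|$, which vanishes only on the equator.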


25.4 The Connection to Hamilton-Jacobi Theory


The geometrical interpretation of the solution of the Hamilton-Jacobi 
equation is interesting and illuminating. It turns out that one can 
suppose that the S-surface is moving in Q space along curves that can 
be regarded as geodesics in Q space with a metric determined by the 
variational principle of the dynamical problem. 

We shall suppose that the points $(q_1, \dots, q_n)$ occupy a simply
connected domain $Q$. We now write the time interval as $J = [t_0, t_1]$, and
consider the space $Q \times J$. We can regard this space as being made up of
hypersurfaces $Q \times \{t\}$ for each fixed value of $t \in J$. As we have
seen, problems in dynamics often throw up a different family of
hypersurfaces, given by the equations

\[ S(q_1, \dots, q_n, t) = C, \]

where $C$ is a constant. The idea here is that through each point
$(q_1, \dots, q_n, t_0)$ there passes a curve that leads in $Q \times J$ to the
hypersurface $(q_1, \dots, q_n, t_1)$; we will see how to specify the right
curve shortly: it will be an extremal corresponding to a variational
principle. We assume that these curves never cross, and so we can
suppose that the hypersurface $(q_1, \dots, q_n, t_0)$ flows along them in the
direction of the hypersurface $(q_1, \dots, q_n, t_1)$. However, we do not


assume that the flow is at the same speed on each of the designated 
curves; as the example of geometrical optics suggests, commonly one 
imagines light is travelling along these curves in a way that depends on 
the medium through which it is passing. 

Or, which is not very different, one supposes that at each point $P_0$
of the hypersurface $(q_1, \dots, q_n, t_0)$ there is a metric on $Q \times J$
and a geodesic in $Q \times J$ that joins $P_0$ to a unique point $P_1$ on the
hypersurface $(q_1, \dots, q_n, t_1)$. Now one considers the surfaces that are
defined as the points a fixed distance $C$ from the initial hypersurface
$(q_1, \dots, q_n, t_0)$.


In his [188], Lipschitz generalised Hamilton’s principle and deduced
conservation of energy in the new setting. Then he noted that in the
special case where $n = 2$ and $U$ is a constant his formulation of the
Hamilton-Jacobi equation is Gauss’s equation (25.1). This suggested to
Lipschitz that in any number of variables the Hamilton-Jacobi equation 
can be considered as a transformation of a metric, and the best case is 
when the equations of motion (the Euler-Lagrange equations) in the 


new coordinates z,, = P(x, y) are solved by equations of the form 
yj = const. 


He was led to establish this theorem (here and below $f(dq)$ is
shorthand for the quadratic form in the transformed system):


Let $P(q_1, \dots, q_n, \alpha_1, \dots, \alpha_n)$ be a complete solution of the
Hamilton-Jacobi equation. Fix the values of $\alpha_1, \dots, \alpha_n$ and
consider the family of trajectories of the mechanical system that
are orthogonal to the $(n-1)$-dimensional manifold $P = A$ with
respect to the form $f(dq)$. Then the trajectories are determined
by the equations

\[ \frac{\partial P}{\partial \alpha_j} = \left(\frac{\partial P}{\partial \alpha_j}\right)_0, \]

where $\left(\frac{\partial P}{\partial \alpha_j}\right)_0$ is the value of
$\frac{\partial P}{\partial \alpha_j}$ at the intersection point.
Moreover, any other $(n-1)$-dimensional manifold $P = B$ will
cut the trajectories orthogonally with respect to the form $f(dq)$,
and the action integral along the trajectories between the two
$(n-1)$-dimensional manifolds $P = A$ and $P = B$ will all have
the value $B - A$.
He also established the converse result⁶:


Let $P(q_1, \dots, q_n) = A$ denote an $(n-1)$-manifold. Consider the
family of trajectories of the mechanical system cutting this
manifold orthogonally with respect to $f(dq)$. On each trajectory
and on the same side of the $(n-1)$-manifold determine a point
such that the action integral $V$ between the $(n-1)$-manifold
and this point is equal to $B - A$. Then these points make up an
$(n-1)$-manifold which is orthogonal to all the trajectories with
respect to $f(dq)$. Moreover if the action integral $V$ along a
trajectory from its intersection with $P = A$ to an arbitrary point
is considered a function of this latter point, then $R = V + A$ is a
solution of the Hamilton-Jacobi equation and the $(n-1)$-manifold
$R = A$ will coincide with the original $(n-1)$-manifold $P = A$.


Thus, Lipschitz’s work made clear the close analogy between the 
geometric study of geodesics in a manifold and the dynamical study of 
trajectories in Hamiltonian mechanics. To be sure, he spoke of a 
quadratic form, not a metric. But he knew Riemann’s work very well, 
and Beltrami’s, and he referred to Gauss’s work on geodesics. 

It seems that he was unaware of the earlier work of Liouville, and in 
turn his work was unknown to Thomson and Tait, who had stated 
Lipschitz’s result (for a single particle subject to a force) in vol. 1, p. 353 
of their book. But other authors did read Lipschitz’s paper, among them 
the French geometer Gaston Darboux. 

Darboux makes an interesting contrast with Felix Klein. Both men 
saw geometry as the natural way to formulate and solve problems, both 
were energetic writers of books as well as research articles. 

Darboux inclined, as a well-educated French mathematician, to the 
study of differential equations and differential geometry; Klein, as a 
well-educated German mathematician, to projective geometry, with an 
original interest in groups. 

In Volume 2 of his Leçons sur la Théorie Générale des Surfaces [58],
Darboux developed the study of curves on surfaces, the ideas of 
curvature and torsion, and geodesics. Then he turned (in Book 5, 
Chaps. 6-8) to what he saw as the close analogy between Gauss’s 
theory of geodesics and Jacobi’s theory of analytical mechanics. In this 
spirit, he mentioned the work of Thomson and Tait, introduced 
Hamilton’s principle, and took his readers through Lagrangian and 
Hamiltonian dynamics in the manner of Liouville and Lipschitz. With 
his lucid exposition the theories of classical mechanics and differential 
geometry were united. 


Indeed, as Darboux puts it (§569) the work of Liouville and 
Lipschitz 


establishes the principle of least action without the use of the 
calculus of variations, and by methods that are entirely 
algebraic. 


Lützen has drawn attention to a particularly attractive point in
Darboux’s account (Chap. VIII, §§571-577). Darboux formally 
eliminated the time variable from the action integral by noting that 
conservation of energy implies that 


i i 
are 4/——, 
U+h 


where 


Therefore, Lagrange’s equations can be written in the form (in 
Darboux’s disturbing notation) 


O O 
d— V(U + h)T -—wvJ(U+h)T = 0. 
Odqj 0g; 


But this is what is found by saying that the first variation of the integral 
I 1 
1 
i VU +hT = 5 i) VU + A)T Yanda jda 
ik 


vanishes. 
Darboux therefore set

\[ ds^2 = (U + h)\sum_{j,k} a_{jk}\,dq_j\,dq_k. \]

He called $ds$ the elementary action, and remarked that the general
problem of mechanics is reduced to the study of the extrema of the
integral $\int ds$. But


This is what the principle of least action consists of; and one 
sees immediately, thanks to this principle, that the general 
problem of mechanics is only an extension to any number of 
variables of the problem of studying geodesic curves. 


(All that is missing is the fully Riemannian idea that any number of 
variables together with a metric define a geometric space.) 

The final detail was contributed by Heinrich Hertz. His expertise 
was in physics, and he fulfilled his initial promise by being the first 
person to confirm Maxwell’s prediction that electro-magnetic waves 
travel at the speed of light and that light is itself an electro-magnetic 
wave. The book in which he set out his most fundamental ideas about 
mechanics, his Die Prinzipien der Mechanik in neuem Zusammenhange
dargestellt, was published posthumously in the year of his death, 1894. 

Hertz had a dislike of the concept of force—there was in fact a long 
tradition of mathematicians and physicists for whom the word covered 
up a lack of understanding and needed to be replaced. He also disliked 
the concept of energy, and his own way of doing without these concepts 
was to re-interpret the Lagrangian equations of dynamics in terms of 
some new, “hidden” masses. This seems to have convinced no one, but 
on the way he came up with a thorough-going geometrical reading of 
dynamics that is close to that of Lipschitz and Darboux, but which went 
one step further by calling the quadratic form a metric. Surprisingly, it 
seems from the definitive study of Hertz’s mechanics that Hertz came to
these ideas initially unaware of this earlier work ([194], 160), however
much his later acquaintance with it may have affected his
exposition.

It was now possible for mathematicians to say explicitly what 
Darboux had said at length but without the concept of a general space 
up front: Trajectories in Hamiltonian dynamics evolve in phase space 
along trajectories that are geodesics with respect to the metric that is 
the quadratic form in the energy functional. 


25.5 Exercises 
[In defiance of my earlier proscription!] 


1. Given a straight line (a geodesic) in the plane and a family of
geodesics at right angles to it and of equal length, what is the curve
formed by the end points of these geodesics?

2. The same question, but for geodesics on the sphere.

3. The same question, but for geodesics in the disc with a metric of
constant negative curvature (the non-Euclidean or hyperbolic disc).


Questions 


1. The extrema of the integral of a Lagrangian with fixed times for the
lower and upper end points determine the curves along which a
dynamical system evolves. By studying what happens when the
upper end point or time is allowed to vary, Hamilton studied how
those trajectories evolve. The time-dependent Lagrangian satisfies
a partial differential equation with respect to which these
trajectories are the characteristic curves. These curves can be seen
as geodesics with respect to a metric determined by the Lagrangian.
If you know about general relativity, this is the bridge between
Einstein’s approach and Hilbert’s; see what you can find out about
their competitive rivalry in 1915, starting with Corry [48].


Footnotes 
1 See also Lützen ([192], Chaps. XVI, XVII).


2 This account follows [192], the definitive biography of Liouville, see pp. 739-747. 


3 Away, that is, from what are called umbilic points. 


4 Earlier Schering, Riemann’s successor in Göttingen, had written his [240] on potential
theory in non-Euclidean geometry.


5 Quoted in Lützen ([193], 36).


6 This and the earlier result are quoted in Lützen ([193], 43-44).
26.1 Introduction


Lagrange’s theory of the calculus of variations was successful, 
influential, and, just like the early calculus itself, hard to explain. As it
was steadily improved, first by Legendre and then by Jacobi, it also
became clear that it reflected an eighteenth-century naivety about the
nature of functions and had not properly considered the range of
possibilities for candidate curves. In particular, functions were
generally taken to be infinitely differentiable. The best
nineteenth-century theory for this was presented by Adolf Kneser towards
the end of the century, building on earlier ideas of Weierstrass.¹ The subject
also forms the last of the famous Hilbert Problems. 


26.2 After Lagrange 


Lagrange’s elegant enrichment of Euler’s insights created a workable 
calculus of variations, but it rested on a mysterious theory, and as 
Lagrange’s personal confidence in an algebraic foundation for the 
calculus as a whole found few who shared it, it was natural for others to 
investigate and extend the theory more carefully. 


It was clear, for example, that the solutions found were generally 
either maxima or minima, and that this issue could usually be decided 
from the context, but it would be better to have a way of deciding it on 
theoretical grounds. Legendre initiated such an investigation in a paper 
of 1788, when he studied what is called the second variation of the 
integral. (The name derives from the analogy with the calculus of
functions $y = y(x)$: if at a given point $\frac{dy}{dx} = 0$ then the sign of
$\frac{d^2 y}{dx^2}$ can determine if the point is a local maximum or
minimum of $y(x)$.)
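Legendre took the integral

\[ I = \int_a^b f(x, y, y')\,dx \]

and assumed that the function $y$ is an extremal for the integral $I$. He
then considered a function $w(x)$ which varied $y(x)$, writing $y + w$ for
the varied function, where $w(a) = 0 = w(b)$. The first variation of $I$ is

\[ I_1 = \int_a^b \left[\frac{\partial f}{\partial y}\,w + \frac{\partial f}{\partial y'}\,w'\right] dx \]

and the second variation of $I$ is

\[ I_2 = \int_a^b \left[\frac{\partial^2 f}{\partial y^2}\,w^2 + 2\,\frac{\partial^2 f}{\partial y\,\partial y'}\,w w' + \frac{\partial^2 f}{\partial y'^2}\,w'^2\right] dx. \]

The total variation in $I$ is given by

\[ \Delta I = I_1 + \tfrac{1}{2} I_2 + \cdots. \]

On the reasonable assumption that $I_1$ dominates this expression, for an
extremal $y$ it is clear that $I_1$ must vanish. This ensures the validity of
the Euler-Lagrange equations for $y$. To determine whether the extremal
is a maximum or a minimum, Legendre looked at the sign of $I_2$. After
some work, here omitted, he concluded that the extremal will be a
minimum if

\[ \frac{\partial^2 f}{\partial y'^2} > 0. \]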


This criterion is easy to use, but its derivation was troubling.
Legendre produced a transformation of the integral $I_2$ that converted it
into the integral

\[ I_2 = \int_a^b \frac{\partial^2 f}{\partial y'^2}\left(w' + \frac{\frac{\partial^2 f}{\partial y\,\partial y'} + h}{\frac{\partial^2 f}{\partial y'^2}}\,w\right)^2 dx, \]

from which the conclusion follows immediately. But the function $h$ is
found by solving the differential equation

\[ \frac{\partial^2 f}{\partial y'^2}\left(\frac{\partial^2 f}{\partial y^2} + h'\right) = \left(\frac{\partial^2 f}{\partial y\,\partial y'} + h\right)^2 \]

for the unknown function $h$. However, Legendre had produced
no general method for solving this non-linear equation, and
Lagrange was able to show that solutions may not exist on the whole
interval $[a, b]$.

The next advance was made by Jacobi, in his major paper [150]— 
almost 50 years after Legendre’s. In it, he showed how to use a solution 
to the Euler-Lagrange equations for y to obtain a solution to Legendre’s 
differential equation. In particular, he was able to show that the 
existence of a solution to the Euler-Lagrange equations stands or falls 
with the existence of solutions to a transformed version of Legendre’s 
differential equation that never vanish in the interval (a, b). But, for 
whatever reason, Jacobi gave no proofs of his results in this paper, and 
it, therefore, generated a considerable amount of research by German 
mathematicians of the next generation, chiefly Otto Hesse, Rudolf 
Clebsch, and Adolph Mayer, before its conclusions were considered fully 
established in the mid-1850s. 
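A small numerical experiment shows why Legendre's sign condition alone cannot settle the matter and why the length of the interval enters. The integrand $f = y'^2 - y^2$, the extremal $y = 0$, and the test variation $w(x) = \sin(\pi x / b)$ are illustrative choices made here, not examples from Legendre or Jacobi. For this integrand $\partial^2 f / \partial y'^2 = 2 > 0$ everywhere, and the second variation about $y = 0$ is $\int_0^b (2w'^2 - 2w^2)\,dx$, which the sketch evaluates by Simpson's rule.

```python
import math

def second_variation(b, n=2000):
    """Second variation of the integral of (y'^2 - y^2) about y = 0 on [0, b].

    Uses the test variation w(x) = sin(pi x / b), which vanishes at both
    end points, and composite Simpson's rule with n (even) subintervals.
    """
    def integrand(x):
        w = math.sin(math.pi * x / b)
        dw = (math.pi / b) * math.cos(math.pi * x / b)
        return 2 * dw * dw - 2 * w * w
    h = b / n
    total = integrand(0.0) + integrand(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

print(second_variation(2.0))  # interval shorter than pi
print(second_variation(4.0))  # interval longer than pi
```

The exact value is $\pi^2/b - b$, so the sign changes at $b = \pi$: on a short interval $y = 0$ passes the second-variation test, while on a longer one this variation lowers the integral even though Legendre's inequality still holds.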


26.3 Weierstrass’s Theory 


Such were the difficulties inherent in the calculus of variations, and 
perhaps also such was the naiveté about the varied nature of functions 
throughout much of the nineteenth century, that it is only with 
Weierstrass’s lectures at the University of Berlin in 1879 and thereafter 
that some fundamental issues were confronted for the first time
(Fig. 26.1).


Fig. 26.1 Karl Weierstrass (1815-1897) by Conrad Fehr 1895 


One concerns the very idea of a variation, or more precisely, a small
variation. If one attempts with goodwill to copy the graph of a function,
say for definiteness $y = \sin x$ on some interval, one surely draws a


smooth curve that looks very like the original even though it may agree 
with it at very few points. However, it is easy to convince oneself 
intellectually that there is, for example, a very wriggly curve that 
crosses the sine curve very many times and is always closer to the sine 
curve than the first smooth approximation that was drawn. Which of 
these curves should be considered as the smaller variation of the sine 
curve? 


As put, Lagrange’s theory gives no reason to hesitate: the 
comparison is made only between curves of the first kind, whose values 
and slopes differ little. But as sophistication about curves grew, 
mathematicians realised that it might be advisable to decide whether 
rapidly oscillating curves were close to slowly oscillating ones, or if the 
marked variation in corresponding values of y’(x) should be taken into 


account. In particular, the English mathematician Isaac Todhunter was 
worried by the possibility of a few abrupt changes in values of y’(x), 


but he did not produce a systematic theory to cope with the difficulties 
that he had identified. 
Weierstrass, who was alert to the great difference between
differentiable and merely continuous functions, went much further.
Given an integral $I$ of the form $I = \int_a^b f(x, y, y')\,dx$ and a
function $\bar{y}$ that is a solution of the corresponding Euler-Lagrange
equation, Weierstrass considered the curve described by $\bar{y}$ and a
comparison curve $y(x)$ that agreed with $\bar{y}$ at the initial point and
again at a point $P$ where they crossed.² So their slopes will differ at
$P$. He considered the excess function

\[ E(x, \bar{y}, \bar{y}', y') = f(x, \bar{y}, y') - f(x, \bar{y}, \bar{y}') - \frac{\partial f}{\partial y'}(x, \bar{y}, \bar{y}')\,(y' - \bar{y}'). \]

This allowed him to consider comparison curves with different slopes,
and he was able to show that it is necessary and (he believed) also
sufficient for the curve $\bar{y}$ to be a minimum that

\[ E(x, \bar{y}, \bar{y}', y') \ge 0. \]
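To see the excess function at work, the sketch below evaluates it for the arc-length integrand $f = \sqrt{1 + y'^2}$, chosen here purely as an illustration; the names `f`, `f_p`, and `excess` are hypothetical. It assumes the usual definition of the excess function as the gap between $f$ at the comparison slope and its first-order Taylor approximation about the extremal's slope. Because this integrand does not depend on $x$ or $y$, only the two slopes matter.

```python
import math

def f(p):
    """Arc-length integrand as a function of the slope p = y'."""
    return math.sqrt(1.0 + p * p)

def f_p(p):
    """Derivative of f with respect to the slope."""
    return p / math.sqrt(1.0 + p * p)

def excess(p_bar, p):
    """Excess of f at comparison slope p over its tangent line at p_bar."""
    return f(p) - f(p_bar) - (p - p_bar) * f_p(p_bar)

slopes = [k * 0.5 for k in range(-6, 7)]
smallest = min(excess(p_bar, p) for p_bar in slopes for p in slopes)
print(smallest)          # never negative on this grid
print(excess(0.0, 1.0))  # slope 1 compared against a horizontal extremal
```

The excess is non-negative for every pair of slopes on the grid, as the convexity of $f$ in $y'$ suggests, and it vanishes only when the two slopes agree.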


Weierstrass’s analysis began to clarify the nature of the comparison 
curves, because it opened the way to curves that oscillate much more 
than the extremal. Later, Adolf Kneser was able to show that if a 
minimiser of an integral is sought only among curves that differ from an 


extremal only slightly in respect of both their values and the values of 
their derivatives, then Weierstrass’s necessary and sufficient conditions 
for a minimum are correct. But if comparison curves may differ in the 
values of the derivatives, then Weierstrass’s condition is not sufficient. 
He called the first case the “weak” variation and the second case the 
“strong” variation. 

In his lectures at Berlin, Weierstrass developed his theory in the 
formalism of parameterised curves. This had the effect of making it 
harder to use in any but geometrical problems, but that has the 
incidental effect of making it easier to describe in general terms. 

At issue is the search for a curve in the plane that joins two given
points and minimises a given integral
$I = \int_0^1 F(x, y, x', y')\,dt$. The curves through these points will be
considered in the form $(x(t), y(t))$, $0 \le t \le 1$. Given one such curve
a nearby curve will be written as

\[ (x(t) + \xi(t),\; y(t) + \eta(t)), \qquad 0 \le t \le 1, \]

where $\xi(0) = \xi(1) = 0 = \eta(0) = \eta(1)$. (Precisely what conditions can
be imposed on the curves and their variations soon became a topic of
research; here we shall assume continuously twice differentiable.)

The variation in the integral $I$ can be written down and analysed,
and the conclusion is that the first variation must vanish for all
admissible functions $\xi$ and $\eta$. This implies Weierstrass’s form of the
Euler-Lagrange equations:

\[ F_x - \frac{d}{dt}F_{x'} = 0, \qquad F_y - \frac{d}{dt}F_{y'} = 0. \]

These equations are not independent, however, and are equivalent to
the differential equation

\[ F_{xy'} - F_{yx'} + F_1\,(x'y'' - x''y') = 0, \]

where $F_1 = F_1(x, y, x', y')$ satisfies

\[ F_{x'x'} = y'^2 F_1, \qquad F_{x'y'} = -x'y' F_1, \qquad F_{y'y'} = x'^2 F_1. \]

This function, which Weierstrass showed exists, is “infinite” when $x'$
and $y'$ vanish simultaneously. Curves $(x(t), y(t))$, $0 \le t \le 1$, satisfying
this differential equation are extremals for the problem. A further
equation is needed to determine the functions $x(t)$ and $y(t)$ precisely,
should that be necessary.

The method works well for finding the shortest curve on a surface
joining two given points when the surface is given as a graph over a
region of the plane, and so in the form $z = f(x, y)$. The integral to be
minimised is

\[ I = \int_0^1 (E u'^2 + 2F u'v' + G v'^2)^{1/2}\,dt, \]

where $u' = \frac{du}{dt}$ and $v' = \frac{dv}{dt}$. Here, as usual in
differential geometry, with $\mathbf{r}(u, v) = (u, v, z(u, v))$,

\[ E = \mathbf{r}_u \cdot \mathbf{r}_u, \qquad F = \mathbf{r}_u \cdot \mathbf{r}_v, \qquad G = \mathbf{r}_v \cdot \mathbf{r}_v. \]


As is also usual in differential geometry, the resulting differential 
equation looks intimidating at first but has a conceptually simple 
interpretation: the acceleration of the minimising curve with respect to 
parameterisation by arc length (the principal normal to the curve) is 
normal to the surface. This means that in terms of the intrinsic 
geometry of the surface there is no acceleration, and so the curve is as 
“straight” as it can be, which makes it a geodesic as required. 

Weierstrass’s method extended to the second variation, and 
encompassed both Legendre’s and Jacobi’s conclusions, and resulted in 
necessary and sufficient conditions for both weak and strong minima. 
The method shows, for example, that in the above example the curve is 
indeed of shortest length between the given end points. 


26.3.1 Two Examples 


I take these examples from Bolza ([19], 128-129, 210-211). The first 
requires the above characterisation of geodesics, and the second one is 
straightforward. 


a) Example XI: To determine the curve of shortest length which can be
drawn on a given surface between two given points.

If the rectangular coordinates x, y, z of a point of the surface are given as
functions of two parameters u, v and the curves on the surface are
expressed in parameter-representation

\[ u = \varphi(t), \qquad v = \psi(t), \tag{27} \]

the problem is to minimise the integral

\[ J = \int_{t_0}^{t_1} \sqrt{E u'^2 + 2F u'v' + G v'^2}\,dt, \]

where

\[ E = \sum x_u^2, \qquad F = \sum x_u x_v, \qquad G = \sum x_v^2, \]

the summation sign referring to a cyclic permutation of x, y, z.

The curves must be restricted to such a portion S of the surface that
the correspondence between S and its image T in the u-, v-plane is a
one-to-one correspondence. We further suppose that E, F, G are of class
$C''$ in T and that S is free from singular points, i.e.

\[ EG - F^2 > 0. \]

a) If we use Weierstrass’s form (I) of Euler’s equation, and denote by
$\Phi(F)$ the differential expression

\[ \Phi(F) = F_{xy'} - F_{yx'} + F_1\,(x'y'' - x''y'), \]

we obtain easily

\[ \Phi\left(\sqrt{E u'^2 + 2F u'v' + G v'^2}\right) = \frac{\Gamma}{\left(\sqrt{E u'^2 + 2F u'v' + G v'^2}\right)^3}, \tag{28} \]

where

\[ \Gamma = (EG - F^2)(u'v'' - u''v') + (Eu' + Fv')\left[\left(F_u - \tfrac{1}{2}E_v\right)u'^2 + G_u u'v' + \tfrac{1}{2}G_v v'^2\right] \]
\[ \qquad - (Fu' + Gv')\left[\tfrac{1}{2}E_u u'^2 + E_v u'v' + \left(F_v - \tfrac{1}{2}G_u\right)v'^2\right]. \tag{29} \]

The extremals satisfy, therefore, the differential equation

\[ \Gamma = 0. \tag{29a} \]

This differential equation admits of a simple geometrical
interpretation: The geodesic curvature of the curve (27) at the point t
is given by the expression

\[ \frac{1}{\rho_g} = \frac{\Gamma}{\sqrt{EG - F^2}\,\left(\sqrt{E u'^2 + 2F u'v' + G v'^2}\right)^3}. \]

Hence the curve of shortest length has the characteristic property that
its geodesic curvature is constantly zero, i.e. it is a geodesic.

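For the second example I quote without proof that extremals of an
integral

\[ J = \frac{1}{2}\int_{t_0}^{t_1} (xy' - x'y)\,dt \]

subject to the condition that

\[ K = \int_{t_0}^{t_1} \sqrt{x'^2 + y'^2}\,dt \]

is constant are found by defining $H = F + \lambda G$, where $F$ and $G$ are the
integrands of $J$ and $K$ and $\lambda$ is a constant, and solving the equation

\[ H_{xy'} - H_{yx'} + H_1\,(x'y'' - x''y') = 0, \]

where

\[ H_1 = \frac{H_{x'x'}}{y'^2} = -\frac{H_{x'y'}}{x'y'} = \frac{H_{y'y'}}{x'^2}. \]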
b) Example XIII: Among all curves of given length joining two given 
points A and B, determine the one which, together with the chord AB, 
bounds the maximum area. [This is Dido’s problem.] 


Taking the straight line join A and B for the x-axis, with BA for 
positive direction, we have to maximise the integral 


i qe 
J== { (xy’ — x’y)dt, 
pa 


while 


1 
Ke { al x2 + ydt 
10 


has a given value, say /, which we suppose greater than the distance AB. 


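Since
\[ H = \tfrac12 (xy' - x'y) + \lambda\sqrt{x'^2 + y'^2}, \]

we get

\[ H_1 = \frac{\lambda}{\bigl(\sqrt{x'^2 + y'^2}\bigr)^3}, \]

and therefore [...]
\[ \frac{1}{r} = \frac{x'y'' - x''y'}{\bigl(\sqrt{x'^2 + y'^2}\bigr)^3} = -\frac{1}{\lambda}. \]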


Hence the radius of curvature of the maximising curve is constant and
has the value $|\lambda|$, while its direction is determined by the sign of $\lambda$.


Again, since $H_1$ never vanishes, there can be no corners, and
therefore the curve must be an arc of a circle of radius $|\lambda|$. The centre
and the radius of the circle are determined by the conditions that the
arc shall pass through the two given points and shall have length l.


There are two arcs satisfying these conditions, symmetrical with 
respect to the x-axis. 
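
The final step of this example, fixing the circle from the chord AB and the prescribed length, can be made concrete. The following is only a sketch under my own assumptions: the function name `dido_arc` and the use of bisection are illustrative choices, not anything in Bolza's text. If the arc subtends an angle 2θ at the centre, then the chord is 2r sin θ and the length is 2rθ, so θ is fixed by sin θ/θ = (chord)/(length).

```python
import math

def dido_arc(chord, length, tol=1e-12):
    """Circular arc of given length standing on a chord of given length.

    Returns (radius, half_angle, area): half_angle is half the angle the arc
    subtends at the centre, area is the region between the arc and the chord.
    """
    if not 0.0 < chord < length:
        raise ValueError("need 0 < chord < length")
    # chord = 2 r sin(theta), length = 2 r theta, so sin(theta)/theta = chord/length.
    target = chord / length
    lo, hi = 0.0, math.pi          # sin(theta)/theta falls from 1 to 0 on (0, pi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.sin(mid) / mid > target:
            lo = mid               # ratio still too large, so the angle must grow
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    radius = length / (2.0 * theta)
    # area of a circular segment with central angle 2*theta
    area = radius ** 2 * (theta - math.sin(theta) * math.cos(theta))
    return radius, theta, area

radius, theta, area = dido_arc(chord=2.0, length=math.pi)   # a semicircle on AB
print(radius, theta, area)
```

For a chord of length 2 and an arc of length π the arc is a semicircle, so the radius should come out close to 1 and the area close to π/2. The two symmetrical arcs mentioned in the text correspond to placing the centre on either side of the x-axis.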


26.4 Hilbert’s Problem 23 and the Theory of the Calculus of Variations


By 1900 David Hilbert (Fig. 26.2) was the agreed new leader of 
mathematics in Germany. He was the most powerful figure in the 
growing collection of highly talented mathematicians that Felix 
Klein was bringing together in Göttingen, an authority at successive
stages in his career on algebraic invariant theory, algebraic number 
theory, and plane geometry—a surprising choice that allowed him to 
produce a new branch of mathematics, the study of axiom systems.³


Also in 1900, the French had managed to stage the largest
celebration of the new century: six months of Congresses on various
topics. Philosophy, physics, and mathematics had week-long meetings
in August; Poincaré spoke at all three. But the Congress of
Mathematicians is famous for Hilbert’s contribution. Hilbert’s close
friend Hermann Minkowski had suggested to Hilbert that he speak on
the future of mathematics as a particularly appropriate subject, and 
that is what Hilbert did. He offered a general panorama on the dialogue 
between problems and theory, giving Fermat’s last theorem and the 
brachistochrone problem as his key examples of provocative problems 
that had produced important new theories. Then he presented some 23 
problems (10 in his address at the congress, all 23 in his published 
paper). They were on a great variety of topics, and over the years, 
helped by the prestige of Göttingen, they became celebrated, and
reputations were made for solving one. 

The last five were on analysis, indicative of Hilbert’s shift to a new 
interest (he was occupied with functional analysis for most of the next 
few years, research that provides the origin of “Hilbert space”). 
Hilbert’s 23rd and final problem in his list in Paris connected to the 
opening words of his lecture, where he spoke of the importance of the 
brachistochrone problem and went on to remark that⁴


... for example, the problem of the shortest line plays a chief and
historically important part in the foundations of geometry, in the 
theory of lines and surfaces, in mechanics and in the calculus of 
variations. 


A little later in his address he returned to the topic and made the 
striking remark that 


It is an error to believe that rigor in the proof is the enemy of 
simplicity. On the contrary, we find it confirmed by numerous 
examples that the rigorous method is at the same time the 
simpler and the more easily comprehended. The very effort for 
rigor forces us to discover simpler methods of proof. It also 
frequently leads the way to methods which are more capable of 
development than the old methods of less rigor. 


He offered some examples and then continued 


But the most striking example of my statement is the calculus of 
variations. The treatment of the first and second variations of 
definite integrals required in part extremely complicated
calculations, and the processes applied by the old
mathematicians lacked the necessary rigor. Weierstrass showed
us the way to a new and sure foundation of the calculus of
variations. By the examples of the simple and double integral I
will show briefly, at the close of my lecture, how this way leads 
at once to a surprising simplification of the calculus of 
variations. For in the demonstration of the necessary and 
sufficient criteria for the occurrence of a maximum and 
minimum, the calculation of the second variation and in part, 
indeed, the tiresome reasoning connected with the first 
variation may be completely dispensed with to say nothing of 
the advance which is involved in the removal of the restriction to 
variations for which the differential coefficients of the function 
vary only slightly. 


Then, at the end of his lecture Hilbert outlined the problems posed by 
the calculus of variations in these terms. What he says is not easy to 
follow, and it will be enough to go straight to his conclusion; note only 
that he presented a rigorous and ingenious argument to his conclusion. 
In fact, although his reputation has become that of a highly abstract 
thinker, his contemporaries regarded him as a problem-solver first and 
foremost, and without getting into the details of his new presentation 
of the calculus of variations you can see that his definition of $J^*$
exhibits the cunning of a problem-solver more than the sweep of a
theorist. 

His approach soon became an accepted part of the theory. The 
connection to Hamilton-Jacobi theory was appreciated, but so was the 
simplicity of the derivation. It can be found, for example, in Osgood’s 
paper [208], Bolza’s book ([19], 92) and in the 15-page article by 
Ernst Zermelo and Hans Hahn on recent further developments in the
calculus of variations that they published in the Encyklopädie der
Mathematischen Wissenschaften in 1904. Hilbert himself gave a more 
detailed account in his [144]. 


23. FURTHER DEVELOPMENT OF THE METHODS OF THE 
CALCULUS OF VARIATIONS. 


So far, I have generally mentioned problems as definite and 
special as possible, in the opinion that it is just such definite and 
special problems that attract us the most and from which the 
most lasting influence is often exerted upon science. 
Nevertheless, I should like to close with a general problem, 
namely with the indication of a branch of mathematics 
repeatedly mentioned in this lecture - which, in spite of the 
considerable advance Weierstrass has recently given it, does not 
receive the general appreciation which, in my opinion, is its due 
- I mean the calculus of variations.

The lack of interest in this is perhaps due in part to the need 
of reliable modern text books. So much the more praiseworthy is 
it then that A. Kneser, in a work published very recently, has 
treated the calculus of variations from the modern points of 
view and with regard to the modern demand for rigor. 

The calculus of variations is, in the widest sense, the theory 
of the variation of functions, and as such appears as a necessary 
extension of the differential and integral calculus. In this sense, 
Poincaré’s investigations of the three body problem, for example, 
form a chapter in the calculus of variations, in so far as 
Poincaré derived from known orbits by the principle of variation 
new orbits of similar character. 

I add here a short justification of the general remarks upon 
the calculus of variations made at the beginning of my lecture. 

The simplest problem in the calculus of variations proper is 
known to consist in finding a function y of a variable x such that 
the definite integral 


\[ J = \int_a^b F(y_x, y; x)\,dx, \qquad \left[\, y_x = \frac{dy}{dx} \,\right] \]


assumes a minimum value compared with the values it takes 
when y is replaced by other functions of x with the same initial 
and final values. 

The vanishing of the first variation in the usual sense 


\[ \delta J = 0 \]


gives for the desired function y the well-known differential 
equation 


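\[ \frac{dF_{y_x}}{dx} - F_y = 0, \qquad \left[\, F_{y_x} = \frac{\partial F}{\partial y_x},\ F_y = \frac{\partial F}{\partial y} \,\right] \tag{26.2} \]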


In order to investigate more closely the necessary and 
sufficient criteria for the occurrence of the required minimum, 
we consider the integral 


\[ J^* = \int_a^b \{F + (y_x - p)F_p\}\,dx, \qquad \left[\, F = F(p, y; x),\ F_p = \frac{\partial F(p, y; x)}{\partial p} \,\right] \]


Now we inquire how p is to be chosen, as function of x, y, in
order that the value of this integral $J^*$ shall be independent of
the path of integration, i. e., of the choice of the function y of the
variable x. The integral $J^*$ has the form


\[ J^* = \int_a^b \{A y_x - B\}\,dx, \]


where A and B do not contain $y_x$, and the vanishing of the first
variation
\[ \delta J^* = 0, \]
in the sense which the new question requires, gives the equation
[see below]
\[ \frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} = 0, \]


i. e. we obtain for the function p of the two variables x, y the 
partial differential equation of the first order 


\[ \frac{\partial F_p}{\partial x} + \frac{\partial (pF_p - F)}{\partial y} = 0. \tag{26.3} \]
The ordinary differential equation of the second order (26.2) 
and the partial differential equation (26.3) stand in the closest 
relation to each other. This relation becomes immediately clear 
to us by the following simple transformation 


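\[
\begin{aligned}
\delta J^* &= \int_a^b \{F_y\,\delta y + F_p\,\delta p + (\delta y_x - \delta p)F_p + (y_x - p)\,\delta F_p\}\,dx \\
&= \int_a^b \{F_y\,\delta y + \delta y_x\,F_p + (y_x - p)\,\delta F_p\}\,dx \\
&= \delta J + \int_a^b (y_x - p)\,\delta F_p\,dx.
\end{aligned}
\]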


We derive from this, namely, the following facts: If we construct 
any simple family of integral curves of the ordinary differential 
equation (26.2) of the second order and then form an ordinary 
differential equation of the first order 


\[ y_x = p(x, y) \tag{26.4} \]


which also admits these integral curves as solutions, then the 
function p(x, y) is always an integral of the partial differential 
equation (26.3) of the first order; and conversely, if p(x, y) 
denotes any solution of the partial differential equation (26.3) of 
the first order, all the non-singular integrals of the ordinary 
differential equation (26.4) of the first order are at the same 
time integrals of the differential equation (26.2) of the second 
order, or in short if $y_x = p(x, y)$ is an integral equation of the
first order of the differential equation (26.2) of the second order,
p(x, y) represents an integral of the partial differential
equation (26.3) and conversely; the integral curves of the
ordinary differential equation of the second order are therefore,
at the same time, the characteristics of the partial differential
equation (26.3) of the first order.

In the present case we may find the same result by means of 
a simple calculation [see below]; for this gives us the differential 
equations (26.2) and (26.3) in question in the form 


\[ y_{xx}F_{y_x y_x} + y_x F_{y_x y} + F_{y_x x} - F_y = 0, \]


\[ (p_x + pp_y)F_{pp} + pF_{py} + F_{px} - F_y = 0, \]


where the lower indices indicate the partial derivatives with
respect to x, y, p, $y_x$. The correctness of the affirmed relation is
clear from this.

The close relation derived before and just proved between 
the ordinary differential equation (26.2) of the second order and 
the partial differential equation (26.3) of the first order, is, as it
seems to me, of fundamental significance for the calculus of
variations. For, from the fact that the integral $J^*$ is independent
of the path of integration it follows that


\[ \int_a^b \{F(p) + (y_x - p)F_p(p)\}\,dx = \int_a^b F(\bar{y}_x)\,dx \tag{26.5} \]


if we think of the left hand integral as taken along any path y and
the right hand integral along an integral curve $\bar{y}$ of the
differential equation
\[ \bar{y}_x = p(x, \bar{y}). \]


With the help of Eq. (26.5) we arrive at Weierstrass’s formula 


\[ \int_a^b F(y_x)\,dx - \int_a^b F(\bar{y}_x)\,dx = \int_a^b E(y_x, p)\,dx, \tag{26.6} \]


where E designates Weierstrass’s expression, depending upon $y_x$, p, y, x,


\[ E(y_x, p) = F(y_x) - F(p) - (y_x - p)F_p(p). \]


Since, therefore, the solution depends only on finding an integral 
p(x, y) which is single valued and continuous in a certain 
neighborhood of the integral curve $\bar{y}$, which we are considering,
the developments just indicated lead immediately - without the
introduction of the second variation, but only by the application
of the polar process to the differential equation (26.2) - to the
expression of Jacobi’s condition and to the answer to the
question: How far this condition of Jacobi’s in conjunction with
Weierstrass’s condition $E > 0$ is necessary and sufficient for the
occurrence of a minimum.


Hilbert ended his discussion of this problem with some remarks about 
Kneser’s approach to Weierstrass’s theory, which led to a partial 
differential equation that could be considered a generalisation of the 
Hamilton-Jacobi equation. 
I add a few lines on the derivation of the equation
\[ \frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} = 0. \]
If we set $G = Ay_x - B$ then $G_y = A_y y_x - B_y$ and $G_{y'} = A$, so

\[ G_y - \frac{d}{dx} G_{y'} = A_y y_x - B_y - A_x - A_y y_x, \]

and so (note that $y' = y_x$)
\[ 0 = A_x + B_y, \]

as Hilbert said.
I also add a few lines on the derivation of the equation 


\[ (p_x + pp_y)F_{pp} + pF_{py} + F_{px} - F_y = 0. \]


This comes about through a change of variable argument, in which the 
variables x and y are replaced by the variables x and p. 
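
Hilbert's claim that $J^*$ is independent of the path can be checked numerically in a simple case. The sketch below rests on assumptions of my own that are not in Hilbert's text: the integrand is taken to be the arc-length integrand $F = \sqrt{1 + p^2}$, the family of extremals is the family of straight lines through the origin, so that the slope function is $p(x, y) = y/x$ (which satisfies (26.3), here $p_x + pp_y = 0$), and the integrals are approximated by the midpoint rule. Two curves joining (1, 1) to (2, 2) are compared: the extremal y = x and a perturbed curve.

```python
import math

def F(p):
    """Arc-length integrand F = sqrt(1 + p^2) (an assumed example)."""
    return math.sqrt(1.0 + p * p)

def F_p(p):
    """Partial derivative of F with respect to p."""
    return p / math.sqrt(1.0 + p * p)

def slope_field(x, y):
    """p(x, y) for the family of extremals y = c x (lines through the origin)."""
    return y / x

def hilbert_integral(y, dy, a, b, n=2000):
    """Midpoint-rule value of J* = int_a^b {F(p) + (y_x - p) F_p(p)} dx along y(x)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        p = slope_field(x, y(x))
        total += (F(p) + (dy(x) - p) * F_p(p)) * h
    return total

def length_integral(dy, a, b, n=2000):
    """Midpoint-rule value of J = int_a^b F(y_x) dx along a curve with slope dy(x)."""
    h = (b - a) / n
    return sum(F(dy(a + (i + 0.5) * h)) * h for i in range(n))

# Two curves from (1, 1) to (2, 2): the extremal y = x and a perturbed curve.
straight = (lambda x: x, lambda x: 1.0)
wiggly = (lambda x: x + 0.3 * math.sin(math.pi * (x - 1.0)),
          lambda x: 1.0 + 0.3 * math.pi * math.cos(math.pi * (x - 1.0)))

j_star_straight = hilbert_integral(*straight, 1.0, 2.0)
j_star_wiggly = hilbert_integral(*wiggly, 1.0, 2.0)
j_wiggly = length_integral(wiggly[1], 1.0, 2.0)
print(j_star_straight, j_star_wiggly, j_wiggly)
```

Both values of $J^*$ should agree with the length of the extremal, as Eq. (26.5) asserts, while the length J of the perturbed curve is larger; the difference is the integral of Weierstrass's expression E in (26.6), which is positive for this integrand.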


26.5 Exercises 
Questions 


1. I am not entirely sure why Hilbert was so interested in the calculus
of variations around 1900; one possibility is that it was the
disparity between its importance and its complexity that intrigued
him. But it is also possible that it was minimising principles like the
law of least action that had caught his attention. In 1898-99 he
lectured on mechanics at Göttingen, and topics included “the
energy conservation principle, the principle of virtual velocities
and the d'Alembert principle, the principles of straightest path and
of minimal constraint, and the principles of Hamilton and Jacobi”
and their logical and conceptual inter-relations.⁵ Can you form an
assessment of these principles and the relationships between
them?


Footnotes 


1 An excellent historical account, going into more detail than is possible here, is Fraser [107]. 
2 The bar notation for the extremal is due to Kneser; it was not used by Weierstrass. 
3 As Adolf Hurwitz, his former teacher, remarked.


4 For accounts of Hilbert’s Paris problems, their origins and their influence down the twentieth 
century, see Gray [125] and Yandell [275]. 


5 The quote comes from Corry ([48], 93). 
27.1 Introduction


In the nineteenth century the wave equation, the heat equation, and 
Laplace’s equation (the Dirichlet problem) were solved. Or rather, and 
more accurately, they were solved for a wide range of domains and 
initial conditions. But there was a fundamental lack of clarity about the 
initial conditions for these equations, and almost nothing was known of 
the possibility of standard methods, such as the method of 
characteristics, for dealing with more general equations. Even the now- 
standard classification of second-order linear partial differential 
equations into three types (elliptic, parabolic, hyperbolic) was only 
established in 1889, in a paper by Paul du Bois-Reymond. 

In the 1890s Poincaré shook up the subject of partial differential 
equations with new methods, and shed light on the question of suitable 
initial conditions. We shall consider only his account of the Laplace
equation, which was to lead Hadamard to some surprising insights into
the existence and uniqueness of solutions of partial differential
equations. But Poincaré wrote widely on many aspects of partial
differential equations and applied mathematics; his discussion of
eigenvalue problems was a notable success that there is not room to
discuss in this book.¹


27.2 The Classical Classification of Linear Partial Differential Equations


The now-standard division of linear second-order partial differential 
equations into elliptic, parabolic, and hyperbolic seems to have been 
introduced surprisingly late, in a paper Paul du Bois-Reymond
published in the Journal für die reine und angewandte
Mathematik on the subject in 1889. By the time the paper appeared he 
had died, at the age of 57. 

He had been introduced to mathematical physics by Franz 
Neumann, who taught him about fluids at the University of Königsberg, and
the subject became the topic of his Ph.D. at Berlin in 1859. He then 
pursued a career in mathematics, and eventually became a Professor at 
the Technical University in Berlin in 1884. Although some criticised his 
work for lack of rigour, he made a number of interesting discoveries 
about Fourier series representations, and about the growth of functions 
and infinite numbers. 

The non-degenerate second-order linear partial differential 
equation with constant coefficients for an unknown function u in two 
variables x and y can be reduced to one of three forms: 


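\[ u_{xx} + u_{yy} + 2au_x + 2bu_y + cu = d; \]
\[ u_{xx} - u_{yy} + 2au_x + 2bu_y + cu = d; \]
\[ u_{xx} + u_y + 2au_x + cu = d, \]


where a, b, c, d are constants. It is also easy to see that introducing the
new variable

\[ v = u\,e^{\alpha x + \beta y}, \]
with suitably chosen constants α and β, further reduces these equations to

\[ v_{xx} + v_{yy} + kv = f(x, y); \]
\[ v_{xx} - v_{yy} + kv = f(x, y); \]
\[ v_{xx} + v_y = f(x, y), \]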


where k is a constant. 
Du Bois-Reymond wrote the general partial differential equation of 
the type he was considering in the form 


\[ F(z) = Rr + Ss + Tt + Pp + Qq + Zz = 0, \]


where p, q, r, s, t have their usual meanings as the various first (p, q)
and second (r, s, t) derivatives of z, and R, S, T, P, Q, Z are sufficiently 
differentiable functions of x and y. 

Differentiation gave him equations such as 


\[ dp = r\,dx + s\,dy, \qquad dq = s\,dx + t\,dy, \]
and so on, and du Bois-Reymond deduced that
\[ R\,dp\,dy + T\,dq\,dx - s\,(R\,dy^2 - S\,dx\,dy + T\,dx^2) + \Pi\,dx\,dy = 0, \tag{27.1} \]
where $\Pi = Pp + Qq + Zz$.


He applied these equations to an arbitrary curve C on a solution 
surface, which necessarily satisfied these equations: 


dz = pdx + qdy, dp = rdx + sdy, dq = sdx + tdy, F = 0, 


and observed that two of z, p, q, r, s, t remain arbitrary on C but once
they are given the other four can be found by integration. The simplest
assumption to make, he said, is that z and one of p or q are given
arbitrarily, so that z and the tangent plane to the surface are given
along a curve C. More precisely, he said, no more than two of z, p, q can 
be arbitrary, and for curves that project to a given plane curve only one 
is at our disposal. 

Consider, however, when the equation of the curve is such that 


\[ R\,dy^2 - S\,dx\,dy + T\,dx^2 = 0. \tag{27.2} \]


This equation can be written in the form


\[ (dy - N_+\,dx)(dy - N_-\,dx) = 0, \]
where $N_\pm = \frac{1}{2R}\bigl(S \pm \sqrt{S^2 - 4RT}\bigr)$ and, he remarked, the solutions to
the equations

\[ dy - N_+\,dx = 0, \qquad dy - N_-\,dx = 0 \]

define two families of curves that cross at every point of the (x, y)-plane
when $S^2 - 4RT > 0$. These curves are the projections onto the (x, y)-plane
of the characteristics of the partial differential equation $F(z) = 0$.


The situation on an arbitrary curve and a characteristic curve are 
very different, du Bois-Reymond pointed out, in terms of what is known 
along them, and this is connected, he went on, to the question of what 
has to be given before an integral surface is determined. He had 
investigated this matter in an earlier paper, he said, and here he 
preferred to draw attention to “the most important point” (p. 245), 
which was how widely different the boundary conditions are for real 
and imaginary characteristics. In the case of positive characteristics 
(real characteristic curves) a small change in the curve C changes the 
surface inside the region bounded by C and the characteristics at its end 
points (du Bois-Reymond thought of these three curves as forming a 
triangle). But with imaginary characteristics, when $S^2 - 4RT < 0$, the


entire solution surface is determined by an arbitrarily small part of the 
curve C. In the case of real characteristics du Bois-Reymond noted that 
if one attempts to define the surface initially along a characteristic, then 
one needs an extra arbitrary function that is not required when starting 
with a curve that is not a characteristic (one might say that the moral is 
that characteristics make bad boundary curves). 

After discussing a number of other topics, du Bois-Reymond turned 
in Chap. 4 of his paper to the reduction of second-order linear partial 
differential equations to canonical form. He showed that a linear 
change of variables x and y can have these consequences: 


1. When $S^2 = 4RT$, if one of R or T vanishes then so does S, and the
equation can be made to take the form

\[ Rr + Pp + Qq + Zz = 0, \quad\text{or}\quad Tt + Pp + Qq + Zz = 0. \]

2. When $S^2 - 4RT > 0$, then either S or both R and T can be made to
vanish, and the equation can be made to take the form

\[ Ss + Pp + Qq + Zz = 0, \quad\text{or}\quad Rr - Tt + Pp + Qq + Zz = 0. \]

3. When $S^2 - 4RT < 0$, then S can be made to vanish, but not R and
T, and when S vanishes, the new R and T will have the same sign,
and the equation can be made to take the form

\[ Rr + Tt + Pp + Qq + Zz = 0. \]


At this point he wrote (1889, 265): 


I shall call the differential equations in the two first forms 
parabolic, the second two hyperbolic, and the third form elliptic. 


So any linear, second-order partial differential equation can be reduced 
to one of these forms, provided the condition on R, S, and T is satisfied, 
and the task of the theory of such equations is to study the solutions 
appropriate to each form. 
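
The test on R, S, and T is simple enough to state as a short program. The following is a minimal sketch, assuming du Bois-Reymond's convention that S is the full coefficient of the mixed derivative s (so that the discriminant is $S^2 - 4RT$); the function names are my own, and the coefficients are taken to be numbers, that is, evaluated at one point of the (x, y)-plane.

```python
import math

def classify(R, S, T):
    """Type of R r + S s + T t + ... = 0 at a point, from the sign of S^2 - 4RT."""
    if R == 0 and S == 0 and T == 0:
        raise ValueError("degenerate: no second-order terms")
    disc = S * S - 4 * R * T
    if disc > 0:
        return "hyperbolic"   # two real families of characteristics
    if disc < 0:
        return "elliptic"     # imaginary characteristics
    return "parabolic"        # one (double) family of characteristics

def characteristic_slopes(R, S, T):
    """Real slopes dy/dx = N+, N- solving R N^2 - S N + T = 0 (needs R != 0)."""
    disc = S * S - 4 * R * T
    if R == 0 or disc < 0:
        raise ValueError("need R != 0 and S^2 - 4RT >= 0")
    root = math.sqrt(disc)
    return (S + root) / (2 * R), (S - root) / (2 * R)

print(classify(1, 0, -1), characteristic_slopes(1, 0, -1))  # wave equation r - t = 0
print(classify(1, 0, 1))                                    # Laplace equation r + t = 0
print(classify(1, 0, 0))                                    # heat equation r - q = 0
```

With floating-point coefficients the parabolic case is decided by an exact comparison with zero, so in practice a tolerance would be needed; for variable coefficients the type can change from point to point.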

In Chap. 6 of the paper, du Bois-Reymond offered a proof that a 
hyperbolic (linear, second-order) partial differential equation can be 
solved when one is given an arc $P_1P_2$ that meets no characteristic
more than once, and along which z and p or z and q (and thus z, p, and q)
are given, and the solution holds in the region bounded by the arc and
the characteristics through $P_1$ and $P_2$. He argued that the equation can
be taken in the form


\[ F(z) = s + up + vq + wz = 0, \]


when the characteristics are x = const. and y = const. In this case,


the key result is that the solution of the partial differential equation is 
known in a rectangle when it is known on two adjacent sides, and the 
data is continuous at the common corner. Discontinuities on either 
edge will propagate into the rectangle. But he admitted that his 
argument was intuitive and inconclusive, and that a rigorous proof 
would be hard to find. All he had was a power series argument for 
which a good convergence result was lacking. 
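
The rectangle result can at least be illustrated numerically. The sketch below is not du Bois-Reymond's power series argument: it is a simple finite-difference scheme of my own choosing, restricted to the assumed special case u = v = 0 with w constant, so that the equation is s + wz = 0. The values of z on the two adjacent sides x = 0 and y = 0 are marched into the square cell by cell, using the average of z over the four corners of each cell.

```python
import math

def solve_in_rectangle(w, left, bottom, n=100, size=1.0):
    """Approximate z on [0, size]^2 with z_xy + w z = 0,
    given z(0, y) = left(y) and z(x, 0) = bottom(x).

    Returns a grid z[i][j] approximating z(i*h, j*h), h = size/n.
    """
    h = size / n
    z = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        z[i][0] = bottom(i * h)     # data on the side y = 0
        z[0][i] = left(i * h)       # data on the side x = 0
    c = w * h * h / 4.0
    for i in range(n):
        for j in range(n):
            # Integrating z_xy = -w z over one cell, with z replaced by the
            # mean of its four corner values, and solving for the new corner:
            three = z[i + 1][j] + z[i][j + 1] + z[i][j]
            known = z[i + 1][j] + z[i][j + 1] - z[i][j]
            z[i + 1][j + 1] = (known - c * three) / (1.0 + c)
    return z

# Test case: z = exp(x + y) satisfies z_xy - z = 0, i.e. w = -1.
grid = solve_in_rectangle(-1.0, math.exp, math.exp, n=100)
print(grid[100][100], math.exp(2.0))
```

The computed corner value should be close to $e^2$. The data on the two sides must agree at the common corner, as the text requires; a jump in the data on either side would be carried into the square along the corresponding characteristic.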

He ended his paper several pages later, remarking that 


In this article, I have pulled together the few cases where, so far 
as I can see, the principal integral can be written down at once. 
Leaving aside a few particular cases, such as Riemann’s equation 
for the propagation of sound, it seems that from now on new 
and more general methods must be found, that I will explain in 
future papers. 


27.3 Poincaré and the Dirichlet Problem 


In 1890 Poincaré wrote a paper [216] on the partial differential 
equations of mathematical physics that became much quoted. He 
began by observing that a number of problems in physics— 
electrostatics, electrodynamics, the propagation of heat, optics, 
elasticity, and hydrodynamics—all lead to the same family of partial 
differential equations. Among these is Laplace’s equation, which raises 
the Dirichlet problem, where many different boundary conditions can 
be handled by the method of Green’s functions, but there are several 
others. After surveying them, he went on 


Unfortunately, the first property common to all these problems 
is their extreme difficulty. Not only can one often not solve them 
completely, but it is only at the price of the greatest effort that 
one can rigorously prove the possibility. 


So, he asked himself, is all this hard work necessary? After all, most 
physicists do very well, guided by their experiments. But, he said, 
analysis ought to be able to do it, and a rigorous proof that a problem
can be solved may be quite unsuited to providing numerical estimates
but it teaches us something. Should we nonetheless relax the demand 
for rigour, on the grounds that the differential equations themselves 
have often been established by less than rigorous arguments, and 
experimental results are necessarily approximate? He rejected this too: 
how can one decide if a less than rigorous argument is valid? Who has 
the right to say that an argument insufficient for a mathematician is 
good enough for a physicist? Moreover, he concluded, it is hard to give 
up a problem that has not been completely solved, and some of these 
equations also play a role in pure analysis (in Riemann’s work, for 
example). 

Poincaré then set out a new method for solving the Dirichlet 
problem; the physical context made him cast the problem in three 
dimensions, which was a significant mathematical advance. He 
remarked that the problem was known to always admit a solution, and 
that Riemann had proved this—remarks that no German contemporary 
in the field would have accepted, even in spirit. Then he noted, more 
securely, that solutions had been provided by Schwarz and Carl 
Neumann in Germany and Robin in France. After that, he presented his 
own solution, known as the method of “sweeping out” (“balayage” in 
French), which has something in common with Schwarz’s alternating 
method. 

Poincaré used Green’s theorem, which says that a solution to a 
Dirichlet problem is obtained by finding a suitable Green’s function. He 
also used a rigorous version of another of Green’s theorems to show 
that given a positive electric charge at a point inside a virtual sphere 
one can arrange for a charge distribution on the sphere with the 
property that the potential functions of the two distributions agree 
outside the virtual sphere. 

We can give an informal argument to this effect. It is trivially true of 
a point charge at the centre of a virtual sphere: the potential outside the 
sphere is the same as that of a uniform charge distribution on the 
sphere. If we now move the point, keeping it inside the virtual sphere, 
we expect that the charge distribution on the sphere can vary in such a
way as to produce a new potential function equal to that of the charge 
in its new position. 


Poincaré gave the formulae for this, but left it to the reader to check; 
in fact this had been done first by Thomson and then repeated by 
Maxwell in his great book on electromagnetic theory.²

More specifically, Poincaré showed that a Green’s function U relative
to a sphere of radius R and concentrated at P is the potential function
associated to a unit mass placed at P inside the sphere and a mass of
$-R/OP$ placed at Q outside the sphere. The contribution to $\frac{dU}{dn}$ of a
point M on the sphere is given by
\[ \frac{R^2 - OP^2}{R}\,\frac{1}{MP^3}. \]
If the point P is inside the sphere, the potential of a unit charge at P is
equal to $\frac{1}{MP}$ at M.


If, therefore, a unit charge is distributed over the sphere in such a
way that the charge density at each point M on the sphere varies
inversely with $MP^3$, then, Poincaré argued, the potential W of this
distribution equals $\frac{1}{M'P}$ at points M′ outside the sphere, and at points
M″ inside the sphere the potential is less than $\frac{1}{M''P}$. From this he
deduced that a function equal to the Green’s function U inside the
sphere and zero outside the sphere is harmonic everywhere except at P
and on the surface of the sphere, where it is continuous but its normal
derivative jumps by $\frac{dU}{dn}$. This function is equal to the sum of the potential
function associated to a unit charge at the point P and the potential
function associated to a charge density on the sphere given by
\[ -\frac{R^2 - OP^2}{4\pi R\,MP^3}, \]
which differs from the formula above only by a change of
sign. Therefore the function V that is harmonic inside the sphere and


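takes prescribed values on the sphere, given by a function $V^0$, is
defined by the equation
\[ V(P) = \int V^0\,\frac{R^2 - OP^2}{4\pi R\,MP^3}\,d\omega, \]
the integral being taken over the points M of the sphere, with $d\omega$ the element of area.

Importantly, its maximum and minimum values lie between those of the
given $V^0$, which can be assumed to be positive. Our earlier informal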


argument could not guarantee this crucial detail, which will be used to 
guarantee the existence of a lower bound later on. 
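
The properties of the swept-out charge claimed above can be checked numerically. The following is only a sketch under assumptions of my own: the sphere is the unit sphere centred at the origin, the points P, `X_out` and `X_in` are arbitrary choices, and the surface integrals are approximated by a midpoint rule in the variables u = cos θ and φ. It evaluates the total charge of the density $(R^2 - OP^2)/(4\pi R\,MP^3)$ and its potential at one point outside and one point inside the sphere.

```python
import math

def swept_density(M, P, R):
    """Charge density at M on the sphere |M| = R left by sweeping out a unit charge at P."""
    OP2 = sum(c * c for c in P)
    return (R * R - OP2) / (4.0 * math.pi * R * math.dist(M, P) ** 3)

def sphere_integral(weight, P, R=1.0, n=200):
    """Midpoint-rule integral over the sphere of swept_density(M) * weight(M)."""
    du, dphi = 2.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * du             # u = cos(polar angle)
        s = math.sqrt(1.0 - u * u)
        for j in range(n):
            phi = (j + 0.5) * dphi
            M = (R * s * math.cos(phi), R * s * math.sin(phi), R * u)
            # the element of area on the sphere is R^2 du dphi
            total += swept_density(M, P, R) * weight(M) * R * R * du * dphi
    return total

P = (0.3, 0.0, 0.4)        # position of the unit charge, OP = 0.5 < R = 1
X_out = (0.0, 2.0, 1.0)    # a point outside the sphere
X_in = (0.0, 0.2, -0.1)    # a point inside the sphere

total_charge = sphere_integral(lambda M: 1.0, P)
potential_out = sphere_integral(lambda M: 1.0 / math.dist(M, X_out), P)
potential_in = sphere_integral(lambda M: 1.0 / math.dist(M, X_in), P)
print(total_charge)
print(potential_out, 1.0 / math.dist(X_out, P))
print(potential_in, 1.0 / math.dist(X_in, P))
```

The total charge should come out close to 1, the potential at the outside point close to that of the original point charge, and the potential at the inside point smaller than that of the point charge, which is the inequality Poincaré needed.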

Poincaré now showed how to solve the Dirichlet problem for the 
region R outside an isolated charged conductor of an arbitrary shape, 
provided only that it has a tangent plane at every point and distinct 
principal curvatures—conditions that enabled Poincaré to establish 
some convergence arguments. First, he indicated briefly that the region 
R outside the conductor can be covered by an infinite number of 
spheres $S_j$, $j = 1, 2, \ldots$, of various sizes so that each point of the region
R lies in at least one of these spheres. Then he proposed to define a
harmonic function on the region R that tends to the value 1 at points P 
on the surface of the conductor and that tends to the value 0 as P moves 
off to infinity. 

To do so, he observed that given a sphere $S_j$ and the electric charge
it contained, the charge distribution can be switched with an equivalent
one entirely on the surface of the sphere. This has no effect on the 
potential outside the sphere, which is unchanged, and reduces the 
potential inside the sphere. He called this operation “sweeping out” the 
sphere. 

He started with a large external sphere Σ that surrounds the
conductor and has a uniform charge distribution on it that gives rise to
a potential $V_0$ outside the sphere (the potential goes to zero at infinity)
and a constant potential of 1 everywhere inside it, including the
conductor. Essentially his argument is that the spheres are swept out,
lowering the potential function by a sequence of harmonic functions
inside them but not altering the potential outside them. So the potential
function continues to take the value 1 on the boundary of the conductor
but otherwise drops. It cannot become negative, because all the charges 
introduced are positive, so by Harnack’s theorem—if an increasing or 
decreasing sequence of harmonic functions is bounded it tends to a
limit that is a harmonic function—the limit is everywhere harmonic 
and it is unaltered on the boundary of the conductor, so it continues to 
take the value 1 there. 

In slightly more detail, at least one of the spheres $S_j$ meets Σ and
contains some of the charge distributed over Σ. Poincaré let $S_1$ be
such a sphere and swept it out. The potential function becomes $V_1$, and
there is no charge inside $S_1$. He now swept out $S_2$, an operation that
can put some charge back inside $S_1$. Now he swept out $S_1$ and $S_2$.
Then he turned to $S_3$, and so on. To sweep out every sphere infinitely
often, Poincaré swept them out in this order

\[ S_1, S_2, S_1, S_2, S_3, S_1, S_2, S_3, S_4, \ldots \]
If the nth sweeping out operation empties the sphere $S_k$, the resulting
potential function $V_n$ agrees with the preceding one, $V_{n-1}$, outside $S_k$
and inside $S_k$ it is less: $V_n < V_{n-1}$. So everywhere one has $V_n \le V_{n-1}$.
Because a negative charge never occurs, the decreasing sequence of the $V_n$
is bounded below at every point and so tends to a limit, a function
Poincaré called V.
Consider now the jth sphere $S_j$, which is swept out infinitely often,
say at times $n_k$, $k = 1, 2, \ldots$. Each time there is no charge in its interior
and the corresponding potential function $V_{n_k}$ is harmonic. Now, the
sequence of values of the $V_{n_k}$ tends to a limit and so Poincaré used
Harnack’s theorem from 1887 to deduce that the limit function V is also
harmonic. But every point of the region R lies in at least one sphere so,
said Poincaré, there is a harmonic function defined everywhere on R.
Because everywhere one has $V_0 > V > 0$, and $V_0$ tends to zero at
points arbitrarily far from the conductor, it follows that V tends to zero
at points arbitrarily far from the conductor.

To show that the potential function V(P) tends to 1 as the point P 
tends to a point M, say, on the conductor, Poincaré invoked his 
assumptions on the shape of the conductor to allow him to define a 
sphere that touches the conductor at M and otherwise lies entirely 
inside the conductor. A limiting argument allowed him to show that the 
potential function on a sequence of points tending to M from outside 
the conductor tends to 1, as required. He concluded that the Dirichlet 
principle had been established; it would have been more accurate to 
say that the Dirichlet problem had been solved. 

Poincaré then spent some time weakening the conditions he had 
imposed on the boundary of the conductor, so that it could, for example, 
have a finite number of cone points. 

Then he defended having introduced a new method into an already 
popular field, although it was no better than those of Robin or 
Neumann and in some cases actually slower. But he argued that no 
known method allowed one to go beyond the first approximation 
without calculations that were too repellent, and so the skilled analyst 
will welcome a new method, and his, he remarked, was particularly 
elastic (“if I may use the term”, [216], 231). He then proceeded to
show how it can be adapted in various ways. 


27.4 Exercises 
Questions 


1. Du Bois-Reymond’s classification of linear second-order partial
differential equations is the formal face of a fundamental division.
His presentation emphasises the varying nature and therefore role
of the associated characteristic curves. Reach the same
classification by looking at boundary or initial data for these
equations.


Footnotes 
1 See Verhulst ([260], Chap. 11). 


2 The method involves inversion in spheres. Inversion in a sphere S with centre O and radius R
maps a point P to the point Q on the half-line from O through P and such that $OP \cdot OQ = R^2$ (the
map is not defined at O). The map switches P and Q, and therefore switches the inside and the
outside of the sphere; it is an anti-conformal map (like a reflection). In the plane it is an
inversion in a circle (see Chap. F). A harmonic function is transformed by an inversion of its
domain to another harmonic function with its singular point somewhere else. Traces of this
process are visible in Poincaré’s map.


3 Note that if the sphere is inverted into a plane, P and Q become mirror image points; see 
Sect. . 
28.1 Introduction


This chapter, and Chap. 29 on hyperbolic equations, concludes this
history of differential equations. Two topics of considerable
importance emerge: the regularity of the solutions of elliptic
equations, which was a particular interest of David Hilbert’s, and the
introduction of more rigorous methods in potential theory.


28.2 Picard on Second-Order Linear Elliptic Equations


In a paper in the Journal de Mathématiques for 1890, Picard considered
the linear second-order partial differential equation

$$A\frac{\partial^2 u}{\partial x^2} + 2B\frac{\partial^2 u}{\partial x\,\partial y} + C\frac{\partial^2 u}{\partial y^2} = F\left(u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y\right),$$

in which the coefficients are functions of x and y in a domain for which
$B^2 - AC < 0$. He showed that the solutions are determined by their
values on the boundary of the domain, provided the domain is suitably
small (a condition that ensures that the solution is single-valued and is
therefore a function of x and y).

He took the equation in the form

$$\Delta u = u_{xx} + u_{yy} = F(u, u_x, u_y, x, y).$$

To solve it, he took an arbitrary function $u_0$ and formed the
equation

$$\Delta u_1 = F\left(u_0, \frac{\partial u_0}{\partial x}, \frac{\partial u_0}{\partial y}, x, y\right),$$

which he supposed had the solution $u_1$. He then formed the
equation

$$\Delta u_2 = F\left(u_1, \frac{\partial u_1}{\partial x}, \frac{\partial u_1}{\partial y}, x, y\right),$$

which he supposed had the solution $u_2$, and so on. Each solution is
fixed by its values on the boundary of the domain. If the sequence of
functions $u_n = u_n(x, y)$ converges to a function u(x, y) then the limit
function would be a solution of the equation

$$\Delta u = F(u, u_x, u_y, x, y),$$

as required. The issue therefore is to find conditions that guarantee the
convergence of the sequence of functions.
Picard began with the case where

$$F(u, u_x, u_y, x, y) = a u_x + b u_y + c,$$

where a, b, and c are functions of x and y. Then he dealt with the general
case, and then with the special case where F is a function of u, x, and y
alone, and is also an increasing function of u, when he showed that no
restrictions on the domain are necessary. His method was to establish
the theorem for small contours, and then extend it to arbitrary ones by
looking at overlapping contours, as in Schwarz’s alternating method, to
which he explicitly referred.

This case included the partial differential equation

$$u_{xx} + u_{yy} = A(x, y) e^u,$$

to which Picard paid special attention. As Liouville had shown, this is
the equation for a surface with metric $E = G = e^u$, F = 0 to have its
curvature given by the function A(x, y).

Finally, Picard observed that the same method of successive
approximations could be used to prove the existence of solutions to
ordinary differential equations.¹
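
The remark about ordinary differential equations can be illustrated with a minimal sketch of successive approximations. The test equation y' = y with y(0) = 1, the representation of each approximation by its polynomial coefficients, and all function names are assumptions chosen for illustration; they are not taken from Picard's paper.

```python
import math
from fractions import Fraction


def picard_step(coeffs):
    """Return the coefficients of 1 + integral_0^x y_k(s) ds.

    coeffs[i] is the coefficient of x**i in the current approximation y_k.
    """
    # Integrating c * x**i gives c / (i + 1) * x**(i + 1).
    integrated = [c / (i + 1) for i, c in enumerate(coeffs)]
    # The constant term is the initial value y(0) = 1.
    return [Fraction(1)] + integrated


def picard_iterates(n_steps):
    """Apply n_steps Picard steps to y' = y, y(0) = 1, starting from y_0 = 1."""
    y = [Fraction(1)]
    for _ in range(n_steps):
        y = picard_step(y)
    return y


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(float(c) * x**i for i, c in enumerate(coeffs))


if __name__ == "__main__":
    # Compare the approximations at x = 1 with the exact value e.
    for k in (1, 2, 5, 10):
        approx = evaluate(picard_iterates(k), 1.0)
        print(k, approx, abs(approx - math.e))
```

In this toy case each step adds one more term of the exponential series, so the convergence that Picard had to prove in general can be seen directly.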

We can look a small way past the introduction and glimpse the
subtleties of the problem. One case that Picard looked at was the
equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f(x, y).$$

In this case, the solution of the partial differential equation is

$$u(\xi, \eta) = -\frac{1}{2\pi} \iint f(x, y)\, G(x, y, \xi, \eta)\,dx\,dy,$$

where the double integral is taken over the given surface and G is a
Green’s function that becomes infinite at the point $(\xi, \eta)$ of the surface
like log(1/r) and vanishes on the boundary.²

Picard investigated this solution and found that it was necessary to
ensure an upper bound on $\frac{\partial u}{\partial x}$ and $\frac{\partial u}{\partial y}$. This he could do when the
boundary was either a circle or a curve analytically equivalent to a
circle.
For the more complicated equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = a u_x + b u_y + c,$$

where dt = dx + idz is continuous, it was not possible to write down 


the solution, and so an iterative approach had to be used. Picard now 


showed that the sequence of approximations converged provided the 
boundary curve was small, by which he meant that the solution 
function u(x, y) has no chance to grow uncontrollably and thereby cease 
to be single-valued. 

In a paper he then published in the Journal de l’École Polytechnique
in the same year, 1890, Picard imposed the condition that the
coefficients are analytic functions of x and y and was able to show that
in this case the solution is also analytic.³ His method was the method of
successive approximations.


28.3 Hilbert’s Problems 19 and 20 


This and the next extract are taken from the published version of 
Hilbert’s address on the problems of mathematics at the ICM in Paris in 
1900, ([143]). 


19. ARE THE SOLUTIONS OF REGULAR PROBLEMS IN THE 
CALCULUS OF VARIATIONS ALWAYS NECESSARILY ANALYTIC ? 

One of the most remarkable facts in the elements of the 
theory of analytic functions appears to me to be this: That there 
exist partial differential equations whose integrals are all of 
necessity analytic functions of the independent variables, that is, 
in short, equations susceptible of none but analytic solutions. 
The best known partial differential equations of this kind are the
potential equation

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0$$

and certain linear differential equations investigated by
Picard [J. Ec Poly 1890]; also the equation

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = e^f,$$

the partial differential equation of minimal surfaces, and others.
Most of these partial differential equations have the common
characteristic of being the Lagrangian differential equations of
certain problems of variation, viz., of such problems of variation

$$\iint F(p, q, z; x, y)\,dx\,dy = \text{minimum} \qquad \left[p = \frac{\partial z}{\partial x},\ q = \frac{\partial z}{\partial y}\right],$$

as satisfy, for all values of the arguments which fall within the
range of discussion, the inequality

$$\frac{\partial^2 F}{\partial p^2} \cdot \frac{\partial^2 F}{\partial q^2} - \left(\frac{\partial^2 F}{\partial p\,\partial q}\right)^2 > 0,$$

F itself being an analytic function. We shall call this sort of
problem a regular variation problem. It is chiefly the regular
variation problems that play a rôle in geometry, in mechanics,
and in mathematical physics; and the question naturally arises, 
whether all solutions of regular variation problems must 
necessarily be analytic functions. In other words, does every 
Lagrangian partial differential equation of a regular variation 
problem have the property of admitting analytic integrals 
exclusively? And is this the case even when the function is 
constrained to assume, as, e. g., in Dirichlet’s problem on the 
potential function, boundary values which are continuous, but 
not analytic? 

I may add that there exist surfaces of constant negative 
Gaussian curvature which are representable by functions that 
are continuous and possess indeed all the derivatives, and yet 
are not analytic; while on the other hand it is probable that 
every surface whose Gaussian curvature is constant and positive 
is necessarily an analytic surface. And we know that the surfaces 
of positive constant curvature are most closely related to this 
regular variation problem : To pass through a closed curve in 
space a surface of minimal area which shall enclose, in 
connection with a fixed surface through the same closed curve, a 
volume of given magnitude. 
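
As a worked check that is not part of Hilbert's text, the inequality defining a regular variation problem can be verified for two familiar integrands. Writing F for the integrand, and introducing the abbreviation W for illustration only, a short calculation under these assumptions runs as follows.

```latex
% Dirichlet integral: F = p^2 + q^2
\[
F_{pp} F_{qq} - F_{pq}^2 = 2 \cdot 2 - 0^2 = 4 > 0,
\qquad \text{with Lagrangian equation } z_{xx} + z_{yy} = 0 .
\]
% Area integral: F = W, where W = \sqrt{1 + p^2 + q^2}
\[
F_{pp} = \frac{1 + q^2}{W^3}, \qquad
F_{qq} = \frac{1 + p^2}{W^3}, \qquad
F_{pq} = -\frac{pq}{W^3},
\]
\[
F_{pp} F_{qq} - F_{pq}^2
  = \frac{(1 + p^2)(1 + q^2) - p^2 q^2}{W^6}
  = \frac{1 + p^2 + q^2}{W^6}
  = \frac{1}{W^4} > 0 .
\]
```

Both the potential equation and the minimal surface equation therefore arise from regular problems in Hilbert's sense, which is consistent with his listing them among the equations with only analytic solutions.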


Hilbert then went on 


20. THE GENERAL PROBLEM OF BOUNDARY VALUES. 

An important problem closely connected with the foregoing 
is the question concerning the existence of solutions of partial 
differential equations when the values on the boundary of the 
region are prescribed. This problem is solved in the main by the 
keen methods of H. A. Schwarz, C. Neumann, and Poincaré for 
the differential equation of the potential. These methods, 
however, seem to be generally not capable of direct extension to 
the case where along the boundary there are prescribed either 
the differential coefficients or any relations between these and 
the values of the function. Nor can they be extended 
immediately to the case where the inquiry is not for potential 
surfaces but, say, for surfaces of least area, or surfaces of 
constant positive Gaussian curvature, which are to pass through 
a prescribed twisted curve or to stretch over a given ring 
surface. It is my conviction that it will be possible to prove these
existence theorems by means of a general principle whose
nature is indicated by Dirichlet’s principle. This general
principle will then perhaps enable us to approach the question :
Has not every regular variation problem a solution, provided
certain assumptions regarding the given boundary conditions are
satisfied (say that the functions concerned in these boundary
conditions are continuous and have in sections one or more
derivatives), and provided also if need be that the notion of a
solution shall be suitably extended? [Cf. my lecture on Dirichlet’s
principle in the Jahresbericht der Deutschen Math.-Vereinigung,
vol. 8 (1900), p. 184.]


Hilbert then published two papers on the Dirichlet problem. The first is
an indication of what he developed at length in the second. It would
take us too far afield to follow him and to sort out what he did, and did
not, achieve with them, but we can see how he intended to elucidate his
programmatic remarks in Paris by looking at an extract from the first
paper ([142]).


The Dirichlet principle is a method that Dirichlet, drawing on an
idea of Gauss, used to solve the so-called boundary value
problem, and which can be briefly characterised in the following
way. One erects verticals at the points of the given boundary
curve in the (x, y)-plane and gives them the corresponding
boundary values. Among the surfaces z = f(x, y) that are bounded by
this curve one looks for the one that minimises the value of the
integral

$$J(f) = \iint \left\{ \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2 \right\} dx\,dy.$$

This surface, as one can easily see using the calculus of
variations, is necessarily a potential surface. With the use of
considerations of this kind Riemann gave a proof of the
existence of the solution to the boundary value problem and
then immediately based his great theory of Abelian functions
upon it.

It was first recognised by Weierstrass that this use of the
method of the Dirichlet principle is not sound; indeed, if only a
finite number of numerical values are given one can conclude
without further ado that there must be a least numerical value
among them; from an unbounded number of numerical values
one cannot conclude that there is a least one; rather, it requires a
proof that in the given case there is a surface z = f(x, y) which
gives the least value of the integral J(f).

The important researches of C. Neumann, H.A. Schwarz and
H. Poincaré have shown that under certain very general
assumptions about the nature of the boundary curve and the
boundary values the boundary value problem is solvable and
therefore the existence of a minimal function f(x, y) is assured.

The Dirichlet principle owes its fame to the attractive 
simplicity of its fundamental mathematical idea, to the 
undeniable richness of its possible applications to pure and to 
physical mathematics and its intrinsic plausibility. But since 
Weierstrass’s critique the Dirichlet principle became of 


historical value only and seemed to lose its ability to lead to 
solutions of the boundary value problem. C. Neumann spoke 
regretfully that the so beautiful and so much-used Dirichlet 
principle would now always decline; only A. Brill and M. Noether 
called for new hope to grow in us and expressed the conviction 
that the Dirichlet principle, present in nature, could once more 
enjoy a revival, perhaps in a modified fashion. 

The following is an attempt at a revival of the Dirichlet 
principle. 

Inasmuch as we think of the Dirichlet problem as only a 
particular problem in the calculus of variations, we have been 
led to express it in the following more general form. Every 
regular problem in the calculus of variations has a solution 
provided suitable restrictions are imposed on the given 
boundary conditions and if necessary the idea of a solution is 
suitably extended.⁴

How this principle can be used as a guide to the discovery of 
rigorous and simple existence proofs will be shown by the 
following two examples: 

I. Draw the shortest curve between two given points P and
$P_1$ on a given surface z = f(x, y).

Let $l$ be the lower bound of the lengths of all curves on the surface
between the two points. From the totality of all connecting
curves we look for those curves $C_1, C_2, C_3, \ldots$ whose lengths
$L_1, L_2, L_3, \ldots$ approach the limit $l$. On $C_1$ we draw
from P a length $\frac{L_1}{2}$ and obtain on $C_1$ the point $P_{1/2}^{(1)}$; on $C_2$ we
draw from P a length $\frac{L_2}{2}$ and obtain on $C_2$ the point $P_{1/2}^{(2)}$; on
$C_3$ we draw from P a length $\frac{L_3}{2}$ and obtain on $C_3$ the point
$P_{1/2}^{(3)}$, and so on. The points $P_{1/2}^{(1)}, P_{1/2}^{(2)}, P_{1/2}^{(3)}, \ldots$ have an
accumulation point $P_{1/2}$, which is also a point of the surface
z = f(x, y).

This procedure, which we have applied to P and $P_1$ alike,
and has led to the point $P_{1/2}$, we now apply to the points P and
$P_{1/2}$ and obtain in this way the point $P_{1/4}$ on the given surface,
and also the point $P_{3/4}$ when we apply the procedure to $P_{1/2}$ and
$P_1$. In the same way we find the points
$P_{1/8}, P_{3/8}, P_{5/8}, P_{7/8}, P_{1/16}, \ldots$. All these points and their
accumulation points taken together form a continuous curve
that is the sought-for shortest curve.

The proof of this fact is easily found when one thinks of the
length of a curve as defined as the limiting value of the lengths of
inscribed polygons. As we see at the same time, it is necessary
for this approach that we assume that the given function f(x, y)
and its first differential quotients with respect to x and y are
continuous.


Hilbert’s second example was that of the Dirichlet problem itself. 
Unfortunately, his argument is too long to reproduce here. 


28.4 Exercises 
Questions 


1. Hilbert’s interest in the regularity of solutions to partial differential
equations is perhaps a pure mathematician’s attitude. Do you
agree? What are the implications for physics of his conjecture
(when it is proved, as it shortly was)?


Footnotes 


1 The method had been used earlier by Liouville in connection with Sturm-Liouville theory,
see Lützen ([192], Chap. X).


2 Later, in §4 of his paper, he showed that u will exist even if f is not required to be continuous,
but for u to be twice differentiable it is necessary that f be continuously once differentiable or at
least satisfy some sort of Hölder condition (not to be discussed here).


3 He defined an analytic function of two variables to be one that can be written as a convergent 
power series in the variables. 


4 Hilbert here footnoted his Paris address and the papers ([14, 15]). 


© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 
J. Gray, Change and Variations, Springer Undergraduate Mathematics Series 
https://doi.org/10.1007/978-3-030-70575-6_29 


29. Initial Value Conditions for Hyperbolic Partial Differential Equations


Jeremy Gray (1)
(1) School of Mathematics and Statistics, Open University, Milton 
Keynes, UK 


Jeremy Gray 
Email: jeremy.gray@open.ac.uk 


29.1 Introduction 


By the late nineteenth century it was becoming clear, as the work of
Picard and others shows, that solutions to hyperbolic partial
differential equations have a particular kind of relation to the initial
conditions that can be imposed on them. The person who cleared
this up decisively was the French mathematician Jacques Hadamard who,
in the early years of the twentieth century, made a powerfully
provocative study of the relation between elliptic and hyperbolic partial
differential equations and between boundary and initial conditions.


29.2 Picard on Second-Order Linear Hyperbolic Equations


In the same paper in the Journal de Mathématiques for 1890 that we
have already looked at, Picard showed how to use the method of
successive approximations to solve hyperbolic second-order partial
differential equations. One of his examples, the equation

$$\frac{\partial^2 z}{\partial x\,\partial y} = a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz \qquad (29.1)$$

where a, b, c are functions of x and y, is reproduced in Sect. 31.8, but
here it is more worthwhile to repeat his analysis of why his method for
solving hyperbolic partial differential equations would not work for
elliptic ones. He had posed the problem of solving the partial
differential equation for which the solution is to take a prescribed
value at a point on an arc AB and its first partial derivatives are to take
prescribed values along the arc.


It is quite otherwise when the characteristics are imaginary. To
see this, it suffices to take the simple example of the equation

$$\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = 0.$$

In general one cannot have a solution of this equation that is
continuous in the rectangle ABA’B’ along with its first-order
partial derivatives, and for which $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ take on the arc AB
the succession of values denoted above by $\varphi(\alpha)$ and $\psi(\alpha)$, these
functions being subject to no other condition than being
continuous. In the contrary case one could, in effect, form an 
analytic function p = K that will be holomorphic in the 


rectangle under consideration, the real part of this function 
being arbitrary on the curve AB, which is impossible because a 
holomorphic function determined on an arc of a curve however 
small can only be extended in a unique way. 


29.3 Hadamard and Mathematical Physics 


Jacques Hadamard (Fig. 29.1) was a remarkable analyst, the first (with
de la Vallée Poussin, independently) to prove the prime number
theorem, which says that the number of prime numbers less than a real
number x is well approximated by $x/\ln x$, but his particular fields were
integral equations, the calculus of variations, and partial differential
equations.

But before we proceed, I cannot resist this anecdote, which dates
from a scientific Jubilee in honour of Hadamard in 1937. Picard had
wondered if Hadamard would still remember his lectures on rational
mechanics and Hadamard replied:¹


It is perfectly true that you had accepted the task - should I say 
the burden - of involving us in that artificial and lamentably 
monotonous exercise that is the problem of mechanics for the 
degree. You had been able to render it almost interesting; I 
always asked myself how you were able to do that, because I was 
never able to when it was my turn. 


In his paper [132] Hadamard made some important remarks about the two
types of problem for partial differential equations that typically arise
in physics: the Dirichlet problem and the Cauchy problem. In the
Dirichlet problem the unknown function (say, a function of two
variables) is required to satisfy a given condition at each point on the
boundary of its domain. In the Cauchy problem, the boundary
information is the value of a function and one of its first derivatives at
each point on some boundary.


Fig. 29.1 Jacques Hadamard (1865-1963) 


“These problems”, he said, “are presented in every sort of question
in mathematical physics. However, there is an extensive list of cases in
which one or the other is presented as if it is well posed, I want to say
as possible and determined”. What he wished to point out, he said, was
that “these two circumstances are intimately related, the one to the
other, and this in a sufficiently close way that of the two problems,
entirely analogous in appearance, one can be possible and the other
impossible according to how they correspond or not to a physical
given”.

Hadamard’s language needs a little unpicking. By a possible
problem, he meant one admitting a solution, and by a determined
problem one that has a unique solution. His discovery was that the two
boundary conditions work very differently, even though the equations
look very similar, and that whether one problem has a solution when the
other does not depends on whether the boundary conditions make
physical sense.


He illustrated his point with two examples. Laplace’s equation (A) in
three dimensions leads to Dirichlet’s problem, which is a possible and
determined problem. On the other hand, the Cauchy problem for
equation (A) asks for the determination for $x \ge 0$ of a solution such
that, for x = 0,

$$u = u_0, \qquad \frac{\partial u}{\partial x} = u_0',$$

where $u_0$ and $u_0'$ are given functions of y, z. “This problem”, he said (p.
214), “which has no physical significance, can always be solved when
$u_0$ and $u_0'$ are analytic, but we know today that it is quite otherwise in
the general case”, and he gave a brief explanation of why this was so.
Suppose, for example, that $u_0$ has been defined on a circle C in the
plane x = 0 and consider the hemisphere S in the region $x \ge 0$ bounded
by that circle and within which the Dirichlet problem is known to be solved.
The unknown function u will be determined by the values it takes on C
and S, because they define a region for which the Dirichlet problem is
known to be solved. The part corresponding to the values on S defines
an analytic function of x, y, z inside C, so one can say that W is the
potential of a double layer distributed in the (y, z) plane, the thickness
(density?) of this double layer being represented at each point by $u_0$.


However, the Cauchy problem is only possible if 


is an analytic function. 
Then he considered the wave equation (B). The Cauchy problem in
this case asks: find the solution for $t \ge 0$ such that when t = 0

$$u = U \quad \text{and} \quad \frac{\partial u}{\partial t} = U',$$

U and U’ being given functions of x, y, z. Such a problem is, in general,
possible and determined, he said. The solution is given by Poisson’s
formula

$$u(x, y, z, t) = \frac{\partial \big(t\,[U](t)\big)}{\partial t} + t\,[U'](t),$$

where [U] and [U’] are the mean values of U and U’ on the sphere
centre (x, y, z) and radius t.
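
A short worked check may help here. It assumes the standard form of Poisson's formula for the wave equation with unit wave speed, $u = \partial_t\big(t\,[U](t)\big) + t\,[U'](t)$, and the initial data below are chosen purely for illustration; neither the data nor the abbreviation $r^2$ comes from Hadamard's paper.

```latex
% Illustrative data: U = 0, U' = x^2 + y^2 + z^2 =: r^2.
% Mean of U' over the sphere of centre (x, y, z) and radius t
% (the cross terms average to zero by symmetry):
\[
[U'](t) = r^2 + t^2 .
\]
% Poisson's formula then gives
\[
u(x, y, z, t) = t\,(r^2 + t^2) = t\,r^2 + t^3 .
\]
% Check of the wave equation u_{tt} = u_{xx} + u_{yy} + u_{zz}:
\[
u_{tt} = 6t, \qquad u_{xx} + u_{yy} + u_{zz} = t \cdot (2 + 2 + 2) = 6t .
\]
% Check of the Cauchy data at t = 0:
\[
u\big|_{t=0} = 0 = U, \qquad
\frac{\partial u}{\partial t}\Big|_{t=0} = r^2 = U' .
\]
```

No analyticity of the data was needed to write the solution down, which is the sense in which the problem with t as principal variable is possible and determined.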

But, said Hadamard, one should not infer that the Cauchy
problem for the equation is always possible and determined. That is the
case when t is taken as the principal variable, but it is false for the same
equation when the principal variable is x, y, or z. For example,
taking it with respect to x and with functions $u_0$ and $u_0'$ independent
of t, problem (B) now reduces to problem (A) and is therefore
impossible in general.

For example, taking x as the principal variable, the Cauchy
problem asks for the solution for $x \ge 0$ such that when x = 0

$$u = U \quad \text{and} \quad \frac{\partial u}{\partial x} = U',$$

U and U’ being given functions of y, z, and t. Suppose, he said, that the
functions U and U’ are independent of t. In this case, if the solution is
unique then u will certainly be independent of t, but this reduces
problem (B) to problem (A), which we have just seen is impossible.

Could there, instead, be infinitely many solutions u that take the
same value on x = 0 and for which also the values of $\frac{\partial u}{\partial x}$ are the same?
If so, then there is a solution u of problem (B) that is not identically zero
but vanishes on x = 0 along with $\frac{\partial u}{\partial x}$. Any such solution can be defined
in the region $x \le 0$ by the simple formula u(−x, y, z, t) = u(x, y, z, t). A
consideration of the implications of Poisson’s formula allowed
Hadamard to prove that the only solution of (B) that on x = 0 satisfies
$u = 0 = \frac{\partial u}{\partial x}$ is the function that vanishes everywhere. Therefore the
Cauchy problem in this case cannot be indeterminate and it follows that
it is in general impossible.

Hadamard concluded that the Cauchy problems for t = 0 and x = 0
are very different, and the problem in the second case is much closer to
the theory of equations with imaginary characteristics.


29.4 The Cauchy Problem 


Hadamard began his book by introducing Cauchy’s use of boundary 
conditions when solving a second-order, linear partial differential 
equation, and he concluded the first chapter by writing (§14): 


The result of Cauchy’s and Sophie Kowalewsky’s analysis would 
therefore be that Cauchy’s problem has one (and only one) 
solution every time the surface which bears the data is not 
characteristic, nor tangent anywhere to a characteristic. 
(emphasis Hadamard’s) 


But then he immediately went on in Chapter Two to say that in fact the 
true situation was not so simple, and indeed was almost paradoxical. 


The reasonings of Cauchy, S. Kowalewsky and Darboux, the 
equivalent of which has been given above, are perfectly rigorous; 
nevertheless, their conclusion must not be considered as an 
entirely general one. The reason for this lies in the hypothesis, 
made above, that Cauchy’s data, as well as the coefficients of the 
equations, are expressed by analytic functions; and the theorem 
is very often likely to be false when this hypothesis is not 
satisfied. [...] Indeed, one of the most curious facts in this theory
is that apparently very slightly different equations behave in 
quite opposite ways in this matter. 


To defend his position, he compared the way Cauchy data and Dirichlet
data work. There are occasions when Cauchy’s approach to a
second-order partial differential equation, which involves specifying
initial data in the form of values of the solution function and its first
derivatives on a hypersurface, is valid without any requirement of
analyticity. (Hadamard defined an analytic function on an interval as
one admitting a power series expansion.)

In contrast, the Dirichlet problem for a region requires only that the
boundary values of the solution function be specified. For a region V
bounded by a surface S:


It is a known fact that this problem is correctly set: i.e. it has one
(and only one) solution. This fact immediately appears as
contradictory to Cauchy-Kowalewsky’s theorem: for, if the
knowledge of numerical values of u at the points of S (together
with the partial differential equation) is by itself sufficient to
determine the unknown function within V, we evidently have no
right to impose upon u any additional condition, and we cannot
therefore, besides values of u, choose arbitrarily those of $\frac{du}{dn}$.


To understand the deep reason for this, Hadamard noted but set aside
the fact that the data in a Cauchy problem and a Dirichlet problem are
specified on topologically distinct regions. He thought it more
important that Cauchy data can only supply a solution in a
neighbourhood of the hypersurface, whereas the Dirichlet data lead to
a solution valid throughout the enclosed region.

He then argued that in fact if data on even a small part S of a
hypersurface are non-analytic there will be no solution of Laplace’s
equation valid in a neighbourhood of S. For if there were, there would be
harmonic functions defined on either side of S with the same normal
derivatives at each point of S. But this means that the two harmonic
functions are analytic extensions of each other, and so their values on S
must be analytic, contrary to assumption.

The way out of this paradox would come, he suggested (§15), by 
following Poincaré’s advice, for 


No question offers a more striking illustration of the ideas which 
Poincaré developed at the first International Mathematical 


Congress at Zurich, 1897 (see also La Valeur de la Science, pp. 
137-155), viz. that it is physical applications which show us the 
important problems we have to set, and that again Physics 
foreshadows the solutions. 


Hadamard then gave a more detailed and technical examination of the 
nature of boundary data, before returning to his broad theme. This was 
that theorems proved for analytic functions may not be true when more 
general types of function are considered and that the physical 
interpretation of the problem is a sure guide to whether Cauchy data or 
Dirichlet boundary data are appropriate. He strongly suggested (§18) 
that 


This remarkable agreement between the two points of view 
appears to me as an evidence that the attitude which we 
adopted above - that is, making a rule not to assume analyticity 
of data - agrees better with the true and inner nature of things 
than Cauchy’s and his successors’ previous conception. 


There then followed one of Hadamard’s more famous observations, which
is worth savouring for its own sake. He had in mind a theorem of
Weierstrass’s that ensured that any continuous function may be 
approximated arbitrarily well by an analytic function. This being the 
case, why not replace a non-analytic partial differential equation and 
non-analytic data with very good analytic approximations? Surely this 
will produce arbitrarily good approximations to the solution of the 
original non-analytic problem? 

Hadamard remarked (§18): 


I have often maintained, against different geometers, the 
importance of this distinction. Some of them indeed argued that 
you may always consider any functions as analytic, as, in the 
contrary case, they could be approximated with any required 
precision by analytic ones. But, in my opinion, this objection 
would not apply, the question not being whether such an 
approximation would alter the data very little, but whether it 
would alter the solution very little. It is easy to see that, in the 
case we are dealing with, the two are not at all equivalent. 


Let us take the classic equation of two-dimensional
potentials

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0,$$

with the following data of Cauchy’s

$$(15) \qquad u(0, y) = 0, \qquad \frac{\partial u}{\partial x}(0, y) = u_1(y) = A_n \sin(ny),$$

n being a very large number, but $A_n$ a function of n assumed to
be very small as n grows very large (for instance $A_n = \frac{1}{n^p}$).
These data differ from zero as little as can be wished.


The Cauchy problem with the data

$$u(0, y) = 0, \qquad \frac{\partial u}{\partial x}(0, y) = 0$$

has the unique solution u = 0. As for the new problem, with
boundary data differing from zero by an arbitrarily small amount,
Hadamard continued (see Fig. 29.2):


Nevertheless, a Cauchy problem has for its solution

$$u = \frac{A_n}{n} \sin(ny) \sinh(nx),$$

which, if $A_n = \frac{1}{n}$, or $\frac{1}{n^p}$, or $e^{-\sqrt{n}}$, is very large for any determinate value
of x different from zero on account of the mode of growth of $e^{nx}$
and consequently sinh(nx).

In this case, the presence of the factor sin ny produces a
“fluting” of the surface, and we see that this fluting, however
imperceptible in the immediate neighbourhood of the y-axis,
becomes enormous at any given distance of it however small,
provided the fluting be taken sufficiently thin by taking n
sufficiently great.


Fig. 29.2 The graph of $\frac{A_n}{n}\sin(ny)\sinh(nx)$, $-\pi < y < \pi$, $0 < x < 2$, $n = 10$
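
The growth Hadamard describes can be checked numerically. The sketch below is illustrative only: it assumes the choice $A_n = e^{-\sqrt{n}}$ from his list, the sample distance x = 0.5 from the y-axis, and hypothetical function names.

```python
import math


def cauchy_data_size(n):
    """Largest value over y of the datum u_1(y) = A_n sin(ny), with A_n = exp(-sqrt(n))."""
    return math.exp(-math.sqrt(n))


def solution_size(n, x):
    """Largest value over y of u(x, y) = (A_n / n) sin(ny) sinh(nx) at a fixed x."""
    return cauchy_data_size(n) * math.sinh(n * x) / n


if __name__ == "__main__":
    # As n grows the data shrink towards zero while the solution at x = 0.5 blows up.
    for n in (10, 100, 400):
        print(n, cauchy_data_size(n), solution_size(n, 0.5))
```

The data can be made as small as one likes while the solution at any fixed positive x becomes as large as one likes, which is exactly the failure of continuous dependence on the data that Hadamard is pointing to.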


After some more technical matters Hadamard then observed 


21. Another paradoxical consequence furthermore appears if we 
consider things from the concrete point of view. 

Strictly, mathematically speaking, we have seen (this is
Holmgren’s theorem) that one set of Cauchy’s data $u_0, u_1$
corresponds (at most) to one solution of [Laplace’s equation], so
that, if these quantities $u_0, u_1$ were “known,” $u$ would be
determined without any possible ambiguity.²

But, in any concrete application, “known,” of course, signifies
“known with a certain approximation,” all kinds of errors being
possible, provided their magnitude remains smaller than a
certain quantity; and, on the other hand, we have seen that the
mere replacing of the value zero for $u_1$ by the (however small)
value (15) changes the solution not by very small but by very
great quantities. Everything takes place, physically speaking, as
if the knowledge of Cauchy’s data would not determine the
unknown function.

This shows how very differently things behave in this case
and in those which correspond to physical questions. If a
physical phenomenon were to be dependent on such an
analytical problem as Cauchy’s for $\nabla^2 u = 0$, it would appear to
us as being governed by pure chance (which, since Poincaré, has
been known to consist precisely in such a discontinuity in
determinism) and not obeying any law whatever.

After having been led by physical interpretation to the need
of the above distinctions, we must now try to formulate them
analytically. This is subordinate to the classification of linear
partial differential equations of the second order into different
types.


These are the hyperbolic, parabolic, and elliptic types, but, because 
Hadamard always emphasised the importance of working in any 
number m of variables, he distinguished among the hyperbolic types 
between those in which all but one of the m squares have the same sign 
—which he called the normal hyperbolic type—and the others. 

He then observed that the normal hyperbolic type is the only one 
known in which Cauchy’s problem can be correctly set, and the non- 
normal hyperbolic types are not known to be connected to any physical 
problem and do not lead to any problem known to comply with 
Cauchy’s condition. Finally, for reasons Hadamard explained later in the 
book, elliptic equations never lead to correctly set Cauchy problems. 


29.4.1 Commentary and Concluding Remarks 


Hadamard’s work, which we have done little more than sample here, 
established three things. First, that any theory of partial differential 
equations deals not only with a differential equation but with some 
boundary or initial conditions. Second, that elliptic and hyperbolic 
equations are very different in this respect (and that parabolic 
equations exhibit some features of each type). Third, that there is a 
class of partial differential equations that are what he called well posed:
they have solutions, these solutions are unique, and they depend
continuously on the initial data and any parameters that enter the
problem.

The first point makes clear what Kovalevskaya seems to have 
suspected, and Riemann quite likely understood, that the solution of a 
partial differential equation is not some general expression that is 
made precise when some extra information is supplied (as in the theory 
of ordinary differential equations). This is a natural view, it was held by 
Euler and Lagrange, and it is ultimately shallow. The theory of partial 
differential equations is instead a dialogue between the equation and 
its boundaries. 

The second point was surely understood by Riemann, but 
Hadamard’s insight can be amplified here. Solutions to a hyperbolic 
partial differential equation propagate at a given speed that reflects 
some aspect of the situation being described; solutions to elliptic 
equations propagate instantaneously. The other side of that coin is that 
the solution to an elliptic equation at a point depends on all the 
boundary values, but the solution to a hyperbolic equation at a point 
depends only on the nearby boundary values. 

The third point is Hadamard’s most original. He believed that the 
partial differential equations that arise in science are well posed and 
this is why they can be profitably studied, and that problems that are 
not well posed (ill-posed, as they are called) are likely to be both 
difficult and artificial. Although it is true that today some naturally 
occurring ill-posed problems are studied, Hadamard’s observation is 
deeper than it looks and may yet have useful things to say. 


29.5 Exercises 
Questions 


1. Hadamard’s remarks brought finally into light a fundamental 
failure of the first generations of people who studied partial 
differential equations, in that he showed that these equations 
cannot be studied without the accompanying boundary conditions. 
To what extent does this course suggest that boundary conditions 
were initially almost ignored (they were to be fitted in after the 
equation was solved), then incorporated, then appreciated (and 
given equal status with the equation)? 


Footnotes 
1 See Cartwright ([33], 77). 


2 Holmgren published this theorem in [146].


30.1 Revision and Assessment 3


This chapter is given over to revision and discussion of the final 
assignment, see H.4. 

However, I would like to repeat my recommendation that students 
read the essay [156], which restates and reinvigorates many of the 
concerns that surfaced towards the end of the nineteenth century in 
partial differential equation theory. In particular, there is a stimulating 
return to the concerns that animated Hadamard and Poincaré. 


© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 
J. Gray, Change and Variations, Springer Undergraduate Mathematics Series 
https://doi.org/10.1007/978-3-030-70575-6_31 


31. Translations 


Jeremy Gray (1)
(1) School of Mathematics and Statistics, Open University, Milton 
Keynes, UK 


Jeremy Gray 
Email: jeremy.gray@open.ac.uk 


31.1 Cauchy: Note on the Integration of First-Order
Partial Differential Equations in Any Number of Variables


This is [34] (in Oeuvres (2) 2, 238).

Until now there has been no treatise on the differential and integral 
calculus where one is given the means to integrate completely partial 
differential equations of the first order in any number of independent 
variables. Having been occupied for several months with this object, I
was happy to have obtained a general method appropriate to fulfilling 
this desire. But, having finished my work, I learned that M. Pfaff, a 
German geometer, had been led on his side to the solution of the 
equations mentioned above. As this concerns one of the most important 
questions in the integral calculus, and M. Pfaff’s method is different 
from mine, I believe that geometers will not be without interest in a
short analysis of one and the other. I will first expound the method that 
I have used, profiting, in order to simplify the exposition, from some 
remarks made by M. Coriolis, an engineer at Ponts et Chaussées, and 
some others that have since occurred to me. 

Suppose in the first place that we are to integrate a first-order
partial differential equation with two independent variables. One
already has several methods for integrating an equation of this kind, of
which one (due to M. Ampère) is based on the change of a single
independent variable. The method that I propose, based on the same
principle as in the admitted hypotheses, reduces to this:

Let

\[ f(x, y, u, p, q) = 0 \tag{31.1} \]

be the given equation, in which $x$ and $y$ denote the two independent
variables, $u$ an unknown function of these two variables, and $p$, $q$ the
partial derivatives of $u$ relative to the variables $x$ and $y$. In order to
completely determine the sought-for function $u$ it is not enough to
know that it must satisfy Eq. (31.1); it is also necessary that it satisfies
another condition, for example, that it takes a certain particular value,
a function of $y$, for a given value of the variable $x$. Let us suppose in
consequence that the function $u$ must receive, for $x = x_0$, the particular
value $\varphi(y)$: the function $q$, or the partial derivative of $u$ relative to $y$, will
on this hypothesis receive the particular value $\varphi'(y)$. On the same
hypothesis the general value of $u$ is, as one knows, completely
determined. It now remains to calculate this value: one can proceed in
the following manner.

Let us replace $y$ by a function of $x$ and a new independent variable
$y_0$. The quantities $u$, $p$, $q$, being functions of $x$ and $y$, become themselves
functions of $x$ and $y_0$; and on differentiating on this supposition,¹

\[ \frac{\partial u}{\partial x} = p + q \frac{\partial y}{\partial x}, \tag{31.2} \]

\[ \frac{\partial u}{\partial y_0} = q \frac{\partial y}{\partial y_0}. \tag{31.3} \]

If one takes one of these two equations from the other, after
differentiating the first with respect to $y_0$ and the second with respect
to $x$, one finds that

\[ \frac{\partial p}{\partial y_0} = \frac{\partial q}{\partial x} \frac{\partial y}{\partial y_0} - \frac{\partial y}{\partial x} \frac{\partial q}{\partial y_0}. \tag{31.4} \]

If one also writes the total differential of the first member of Eq. (31.1)
as

\[ X\,dx + Y\,dy + U\,du + P\,dp + Q\,dq = 0, \]

one finds, on differentiating this equation with respect to $y_0$, that

\[ Y \frac{\partial y}{\partial y_0} + U \frac{\partial u}{\partial y_0} + P \frac{\partial p}{\partial y_0} + Q \frac{\partial q}{\partial y_0} = 0, \tag{31.5} \]

and consequently, in view of Eqs. (31.3) and (31.4), that

\[ \left( Y + qU + P \frac{\partial q}{\partial x} \right) \frac{\partial y}{\partial y_0} + \left( Q - P \frac{\partial y}{\partial x} \right) \frac{\partial q}{\partial y_0} = 0. \tag{31.6} \]

Let us now observe that the value of $y$ as a function of $x$ and $y_0$ being
entirely arbitrary one can dispose of it in such a way that it satisfies the
differential equation

\[ Q - P \frac{\partial y}{\partial x} = 0, \tag{31.7} \]


and reduces to $y_0$ on the particular supposition $x = x_0$. The value of $y$
in terms of $x$ and $y_0$ being chosen in the way just described, the particular
values of $u$ and $q$ corresponding to $x = x_0$, that is to say $\varphi(y)$ and $\varphi'(y)$,
become, respectively, $\varphi(y_0)$ and $\varphi'(y_0)$. Representing these values by
$u_0$, $q_0$ one will have

\[ u_0 = \varphi(y_0), \qquad q_0 = \varphi'(y_0). \tag{31.8} \]

As for formula (31.6), it is reduced by Eq. (31.7) to

\[ \left( Y + qU + P \frac{\partial q}{\partial x} \right) \frac{\partial y}{\partial y_0} = 0, \]

and as, $y$ depending on $y_0$ by hypothesis, $\frac{\partial y}{\partial y_0}$ cannot be constantly zero,
the same formula becomes

\[ Y + qU + P \frac{\partial q}{\partial x} = 0. \tag{31.9} \]

This done, the integration of Eq. (31.1) is reduced to the following
question: Find for $y$, $u$, $p$, $q$ four functions of $x$ and $y_0$ which satisfy
Eqs. (31.1), (31.2), (31.3), (31.7), (31.9), and of which three, namely,
$y$, $u$, $q$, reduce, respectively, to $y_0$, $u_0$, $q_0$ on the supposition $x = x_0$.

We do not speak of Eq. (31.4), because it is a necessary consequence
of Eqs. (31.2) and (31.3). As for the particular value of $p$ corresponding
to $x = x_0$, it will not enter into the general values of $y$, $u$, $p$, $q$
determined by the preceding conditions. If one denotes it by $p_0$ it will
be deduced from the formula²

\[ f(x_0, y_0, u_0, p_0, q_0) = 0. \tag{31.10} \]


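It is essential to remark that the general values of $y$, $u$, $p$, $q$ as functions
of $x$ and $y_0$ remain completely determined if, among the conditions
that they must satisfy, one fails to take account of the verification of
Eq. (31.3). This last condition must therefore be an immediate
consequence of all the others. To show this, let us suppose for a
moment that, the other conditions having been verified, the two
members of Eq. (31.3) are unequal. The difference between these two
members can only be a function of $x$ and $y_0$. Let $\alpha$ be this function and
$\alpha_0$ be what it becomes when $x = x_0$. One will have

\[ \alpha = \frac{\partial u}{\partial y_0} - q \frac{\partial y}{\partial y_0}, \qquad \alpha_0 = \frac{\partial u_0}{\partial y_0} - q_0 = \varphi'(y_0) - \varphi'(y_0) = 0. \tag{31.11} \]

Consequently, instead of Eqs. (31.3) and (31.4), one finds

\[ \frac{\partial u}{\partial y_0} = q \frac{\partial y}{\partial y_0} + \alpha, \qquad \frac{\partial p}{\partial y_0} = \frac{\partial q}{\partial x} \frac{\partial y}{\partial y_0} - \frac{\partial y}{\partial x} \frac{\partial q}{\partial y_0} + \frac{\partial \alpha}{\partial x}; \tag{31.12} \]

then, instead of (31.6), the following:

\[ \left( Y + qU + P \frac{\partial q}{\partial x} \right) \frac{\partial y}{\partial y_0} + \left( Q - P \frac{\partial y}{\partial x} \right) \frac{\partial q}{\partial y_0} + U\alpha + P \frac{\partial \alpha}{\partial x} = 0. \tag{31.13} \]

This last equation will reduce, by Eqs. (31.7) and (31.9), which one
supposes verified, to

\[ U\alpha + P \frac{\partial \alpha}{\partial x} = 0. \tag{31.14} \]

On integrating it, and treating $\frac{U}{P}$ as a function of $x$ and $y_0$, one will find

\[ \alpha = \alpha_0 \, e^{-\int_{x_0}^{x} \frac{U}{P}\,dx}, \tag{31.15} \]

and consequently, taking account of the second of Eqs. (31.11), one
will generally have

\[ \alpha = 0. \tag{31.16} \]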


The two members of Eq. (31.3) cannot, therefore, be unequal on the
admitted hypothesis. One must conclude from this that the quantities
$y$, $u$, $p$, $q$ satisfy all the conditions required if these quantities,
considered as functions of $x$, satisfy Eqs. (31.1), (31.2), (31.7), (31.9),
and if in addition $y$, $u$, $q$ reduce, respectively, to $y_0$, $u_0 = \varphi(y_0)$, and
$q_0 = \varphi'(y_0)$ for $x = x_0$. It is useless to add that on the same supposition
$p$ must obtain the particular value $p_0$; in fact this value will not be
contained in the integrals of Eqs. (31.1), (31.2), (31.7), (31.9), because
none of these equations contain $\frac{\partial p}{\partial x}$.


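If, in Eq. (31.2), one substitutes the value of $\frac{\partial y}{\partial x}$ drawn from
Eq. (31.7), one will find

\[ \frac{\partial u}{\partial x} = p + \frac{Qq}{P} = \frac{Pp + Qq}{P}. \tag{31.17} \]

Furthermore, if one differentiates Eq. (31.1) with respect to $x$, one
obtains the following:

\[ X + Y \frac{\partial y}{\partial x} + U \frac{\partial u}{\partial x} + P \frac{\partial p}{\partial x} + Q \frac{\partial q}{\partial x} = 0, \tag{31.18} \]

which the values of $\frac{\partial y}{\partial x}$, $\frac{\partial u}{\partial x}$, $\frac{\partial q}{\partial x}$ drawn from Eqs. (31.7), (31.17),
and (31.9) reduce to

\[ X + pU + P \frac{\partial p}{\partial x} = 0. \tag{31.19} \]

This done, one can substitute Eq. (31.17) for Eq. (31.2), and Eq. (31.19)
for one of Eqs. (31.1), (31.17), (31.7), (31.9). If besides one observes
that, in the case where one considers $y$, $u$, $p$, $q$ as functions of $x$ alone,
one can include Eqs. (31.7), (31.9), (31.17), (31.19) in the algebraic
formula³

\[ \frac{dx}{P} = \frac{dy}{Q} = \frac{du}{Pp + Qq} = -\frac{dp}{X + pU} = -\frac{dq}{Y + qU}, \tag{31.20} \]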


one definitively concludes that to determine the sought-for values of
the quantities $y$, $u$, $p$, $q$, it is enough to work with four of the five
equations contained in the two formulae

\[ f(x, y, u, p, q) = 0, \qquad \frac{dx}{P} = \frac{dy}{Q} = \frac{du}{Pp + Qq} = -\frac{dp}{X + pU} = -\frac{dq}{Y + qU}, \tag{31.21} \]

and to know, for $x = x_0$, the particular values $y_0$, $u_0$, $p_0$, $q_0$, of which the three latter
ones are determined as a function of the first by Eqs. (31.8)
and (31.10).

Suppose, to fix ideas, that by means of the equation

\[ f(x, y, u, p, q) = 0 \]

one eliminates $p$ from three equations in the formula

\[ \frac{dx}{P} = \frac{dy}{Q} = \frac{du}{Pp + Qq} = -\frac{dq}{Y + qU}. \tag{31.22} \]

On integrating the last three, one will obtain three finite equations that
involve, with the quantities

\[ x, \; y, \; u, \; q, \]

the particular values represented by

\[ x_0, \; y_0, \; u_0, \; q_0. \]
If after the integration one eliminates $q$, the remaining two equations
involve, with the quantities $x$, $y$, $u$, and the constant quantity $x_0$, only
the new variable $y_0$, the elimination of which can only be carried out
when one has assigned a particular form to the arbitrary function
denoted by $\varphi$. Whatever it may be, the system of two equations with
which we are concerned can always be considered as equivalent to the
general integral of Eq. (31.1).

As, in all that has been done so far, one can substitute the variable $x$
for the variable $y$, and reciprocally, it follows that the integrals of
Eqs. (31.21) again furnish a solution of the question proposed, if in
the integrals one considers $y_0$ as constant, $x_0$ as a new variable that
one must eliminate, and $u_0$, $p_0$, $q_0$ as functions of this new variable that
are determined by equations of the form

\[ u_0 = \varphi(x_0), \qquad p_0 = \varphi'(x_0), \tag{31.23} \]

\[ f(x_0, y_0, u_0, p_0, q_0) = 0. \tag{31.24} \]

Let us apply the principles we have just established to the solution of
the partial differential equation

\[ pq - xy = 0. \tag{31.25} \]


[The extract from Cauchy [34] ends here.]

Cauchy showed that in this case Eqs. (31.21) become

\[ p\,dx = q\,dy = \tfrac{1}{2}\,du = x\,dp = y\,dq. \tag{31.26} \]

These imply that

\[ \frac{dp}{p} = \frac{dx}{x}, \qquad \frac{dq}{q} = \frac{dy}{y}, \qquad du = 2\,\frac{p}{x}\,x\,dx = 2\,\frac{q}{y}\,y\,dy; \tag{31.27} \]

then on integrating and taking note of the condition $p_0 q_0 = x_0 y_0$,

\[ \frac{p}{p_0} = \frac{x}{x_0}, \qquad \frac{q}{q_0} = \frac{y}{y_0}, \tag{31.28} \]

\[ u - u_0 = \frac{p_0}{x_0}(x^2 - x_0^2) = \frac{q_0}{y_0}(y^2 - y_0^2) = \frac{y_0}{q_0}(x^2 - x_0^2) = \frac{x_0}{p_0}(y^2 - y_0^2). \tag{31.29} \]

In these equations, he said, $x_0$ is an arbitrary constant and $y_0$ a new
variable that one can only eliminate after fixing a value for the arbitrary
function $\varphi$. Finally Cauchy deduced that the general integral is
represented by the equations

\[ (u - \varphi(x_0))^2 = (x^2 - x_0^2)(y^2 - y_0^2), \qquad (u - \varphi(x_0))\,\varphi'(x_0) = x_0 (y^2 - y_0^2), \]

and he noted that the second of these equations is the derivative of the
first with respect to $x_0$.
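
Cauchy's reduction of the example $pq - xy = 0$ to ordinary differential equations can be checked numerically. The following Python sketch is an illustration added here, not Cauchy's own procedure. It assumes an auxiliary parameter $s$ along each characteristic, so that the chain of equal ratios becomes $dx/ds = P$, $dy/ds = Q$, $du/ds = Pp + Qq$, $dp/ds = -(X + pU)$, $dq/ds = -(Y + qU)$. It integrates this system with a classical Runge-Kutta step, and the initial values are arbitrary choices satisfying $p_0 q_0 = x_0 y_0$. All function names are hypothetical.

```python
def characteristic_rhs(state):
    """Right-hand side of the characteristic system for f = p*q - x*y.

    Here X = -y, Y = -x, U = 0, P = q, Q = p, and s is an auxiliary
    parameter along the characteristic (an assumption of this sketch).
    """
    x, y, u, p, q = state
    return (q, p, 2.0 * p * q, y, x)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = characteristic_rhs(state)
    k2 = characteristic_rhs(shifted(k1, h / 2.0))
    k3 = characteristic_rhs(shifted(k2, h / 2.0))
    k4 = characteristic_rhs(shifted(k3, h))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, h=1e-3, steps=500):
    """Follow one characteristic from the initial state."""
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

if __name__ == "__main__":
    x0, y0, u0, q0 = 1.0, 2.0, 0.5, 0.8
    p0 = x0 * y0 / q0                      # enforces p0 * q0 = x0 * y0
    x, y, u, p, q = integrate((x0, y0, u0, p0, q0))
    print("pq - xy          :", p * q - x * y)
    print("p/x versus p0/x0 :", p / x, p0 / x0)
    print("u - u0 versus (p0/x0)(x^2 - x0^2):",
          u - u0, (p0 / x0) * (x * x - x0 * x0))
```

Along the computed characteristic the quantity $pq - xy$ stays at zero, the ratio $p/x$ keeps its initial value as in (31.28), and $u - u_0$ agrees with the first expression in (31.29), up to the accuracy of the integrator.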


Cauchy concluded his paper with the observation that the method 
worked without essential change when there were more than two 
independent variables, and illustrated his point by going over the 
method in the case of three independent variables. He also worked 
through the example 


pqr = xyz.


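Comment The change of variables argument at the start of Cauchy’s
paper may be easier to follow on introducing new variables $s$ and $t$,
where

\[ s = x, \quad t = y_0(x, y), \quad \text{so} \quad x = s \quad \text{and} \quad y = y(s, t). \]

This means that $\frac{\partial u}{\partial s} = p + q \frac{\partial y}{\partial s}$ and $\frac{\partial u}{\partial t} = q \frac{\partial y}{\partial t}$. At the end, restore $x = s$, $y_0 = t$.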


31.2 Riemann’s Lectures on Partial
Differential Equations and Physics


Riemann lectured three times on physics at Göttingen: in 1854/55, in
1860/61, and in 1862. After Riemann’s death his former pupil Karl
Hattendorff edited Riemann’s notes (mostly from the course of
1860/61) and published them as a book [237].⁴ In the preface, he noted
that while mathematicians drawn to the theory of partial differential 
equations took their lead from Dirichlet, as indeed Riemann had done, 
what was here was not restricted to potential theory but included a 
slew of applications. 

The book became the principal introduction to mathematical 
physics for over a generation, because it was taken up and re-edited by 
Heinrich Weber, and “Riemann-Weber”, as it came to be known, grew to
two volumes. 


31.2.1 Riemann, Introduction to Partial Differential 
Equations 


The object of these lectures is the treatment of partial
differential equations and their application to physical
questions. Therefore it is convenient to make some introductory
remarks on the relationship of the theory of partial differential
equations to physics.

It is well known that a scientific physics first began with the 
discovery of the differential calculus. Since one first learned how 
to follow the course of natural events continuously, research into 
the connection of appearances to abstract consequences has 
succeeded. This involves two things: first the simple basic ideas 
with which we construct, and second a method with which, from 
the simple basic laws of this construction that concern points of 
time and space, laws can be derived for finite intervals of time 
and space that alone are accessible to observations (and can be 
compared to experience). 

Galileo took the first step in respect of the basic ideas, when 
he constructed the laws of motion for freely falling bodies from 
the operation of weight at every moment of time; he found the 
law of accelerating force, the idea of a simple cause of motion. To 
this step Newton added a second: he found the idea of an 
attracting centre, the idea of a simple cause of force. With these 
two basic ideas, the idea of accelerating force and of an 
attracting or repelling centre, physics is still constructed to this 
day. The present-day speculations of Laplace, Poisson, Cauchy, 
where the thread of observations stops, are attributable only to 
the struggles with the appearance of these two laws. In respect 
of the ideas that one places at the basis of the physical 
explanation of nature, we therefore take today the standpoint of 
Newton. No new step has been taken since Newton; all research 
into basic ideas that penetrate into the heart of nature have up 
to now failed; the influence of later philosophical systems that 
have been applied in the physical literature have only had the 
success of disfiguring Newton’s original perception with 
inconsistencies. 

But the method, by which the simple basic laws for moments 
of time and space are obtained—differential equations—are 
turned into laws for finite intervals and extended bodies, is 
essentially perfected. At first, after the discovery of the 
differential calculus, one could handle certain abstract cases: in 
the study of free fall one connected the mass of a body with its
centre of gravity, one treated the heavenly bodies as
mathematical points, in the study of pendulums one first treated 
only the mathematical pendulum i.e. a rigid movable line 
connected to a heavy point; so that one only had to take one step 
from the infinitely small to the finite in only one dimension with 
respect to one variable, the time. But in general, in order to 
derive the experiences from the elementary laws, one must take 
the step from the infinitely small to the finite in more than one 
dimension. For the elementary laws involve space and time 
points, experiences involve extended bodies. Such problems 
lead, to speak generally—in special cases the problem can be 
simplified—to partial differential equations. 

Sixty years after the appearance of Newton’s Principia the 
first physical problem was solved that led to a partial differential 
equation. It was the one that d'Alembert showed determined the 
oscillations of a stretched string. It was then a long time until the 
general method was found by which the physical problems that 
lead to partial differential equations can be solved. For this we 
thank Fourier, who first applied such methods in his study of the 
diffusion of heat in solid bodies. This took almost as long from 
the origin of partial differential equations as that had from the 
creation of the differential calculus. Newton’s Principia appeared 
in 1687, d’Alembert’s solution of the problem of the vibrating 
string in 1747, again 60 years later, on 21 December 1807, 
Fourier presented the first part of his work on heat to the Paris 
Academy. 


After these selective and not entirely accurate historical pages 
Riemann turned to list the many areas in physics where partial 
differential equations provided the appropriate foundations. These 
included oscillations in gases, liquids, and solid bodies, elasticity of 
bodies, and optics. He noted that most of this work involved making 
assumptions about molecules that make up these bodies, and so the 
determination of the constants that enter the partial differential 
equations depended on assumptions about the molecular composition
of bodies, to which, he said, we were far from having the key.
The same was true, he went on, for gravitation, electricity, and
magnetism: the fundamental laws involve partial differential equations.
He then concluded:


What then emerges as a fact by means of induction arises also a 
priori: the proper foundations for mathematical physics are 
partial differential equations. True elementary laws can only 
occur in the infinitely small, for space and time points. In 
general, such laws will be partial differential equations, and the 
derivation of laws for extended bodies and times requires their 
integration. So methods are necessary by which the finite laws 
can be derived from the laws of the infinitely small, and indeed 
derived with complete rigour neglecting nothing. Only then can 
they be tested against experience. 


The book itself opens with just under a hundred pages of mathematical 
methods and twenty on the basics of ordinary and partial differential 
equations. Then it turns to a more detailed investigation of heat 
diffusion in solid bodies, oscillations of solid bodies, fluid motion, 
oscillations in compressible media, and finally the motion of a solid 
body in an unbounded incompressible fluid. Nothing, one notes, on 
magnetism and electricity—an omission that became the main reason 
for Weber’s new editions—but one that Riemann had addressed in 
another series of lectures, later published as Schwere, Elektricität und
Magnetismus (Gravity, Electricity, and Magnetism). That said, as Weber
noted, Riemann’s book on partial differential equations was not a
physics textbook but a mathematical book devoted to the solution of 
various mathematical problems. 


31.3 Extracts from Schwarz, “Ueber einige
Abbildungsaufgaben”, 1869


[At this stage in his paper Schwarz has shown that the most general
map of an angular sector of angle $\alpha\pi$ to the upper half-plane that is
given by a map that is analytic everywhere except at the origin, and
maps the origin to itself, is of the form

\[ v = u^{1/\alpha}, \]

\[ t = Cv(1 + a_1 v + a_2 v^2 + \cdots), \]

where $C$ is a non-zero constant and the coefficients $a_i$ are all real. The
inverse function is

\[ u = C' t^{\alpha}(1 + b_1 t + b_2 t^2 + \cdots), \]

where $C'$ is a non-zero constant and the coefficients $b_i$ are all real.

He then continued:]

In a problem about conformal maps the position and absolute size
of the figure in the $u$-plane on which a figure in the $t$-plane is to be
represented conformally is usually unimportant. So the general
solution of the representation problem introduces two arbitrary
constants that determine the position and absolute size, for, if $u$
is a function that maps a figure $T$ in the $t$-plane onto a figure $U$ in the
$u$-plane then $u' = C_1 u + C_2$ is another such function, only it places the
corresponding figure $U'$ in another position, is of another proportion,
and can be dragged to the position of the figure $U$. So if we have to
obtain the characteristic properties of a figure $T$ on a figure $U$ we must
look at the dependence between the quantities $u$ and $t$ to determine
which are independent of the particular position and absolute size of
the figure $U$ in the $u$-plane; that is, to determine the differential
equation in whose general solution the constants $C_1$ and $C_2$ enter as
constants of integration.
This leads to

\[ \frac{d}{dt} \log \frac{du'}{dt} = \frac{d}{dt} \log \frac{du}{dt}. \]

This function is then independent of the particular position and
absolute size of the figure $U$ in the $u$-plane.

The passage from $u$ to $\frac{du}{dt}$ and $\frac{d}{dt} \log \frac{du}{dt}$ is all the more important a
step because all the values of the argument $t$, for which the quantity $\frac{du}{dt}$
becomes infinitely large or infinitely small, and $\frac{d}{dt} \log \frac{du}{dt}$ infinitely large,
are singular points for the representation problem, in that conformal
representation in the strict sense cannot hold for them.

In the case already considered of the conformal representation of an
angle $\pi$ on an angle $\alpha\pi$,

\[ \frac{d}{dt} \log \frac{du}{dt} = \frac{\alpha - 1}{t} + a_1 + a_2 t + \cdots. \]

This function has the character of a rational function in the
neighbourhood of the value $t = 0$. The coefficients $a_1, a_2, \ldots$ all have
real values and therefore the values of the function $\frac{d}{dt} \log \frac{du}{dt}$, for those
real values of the argument $t$ for which the series converges, are
likewise real.

Therefore, when the problem is to conformally map the surface of a
figure $T$ in the $t$-plane onto another bounded by a simple curve (i.e. one
that goes through no point more than once) lying entirely in the finite
part $U$ of the $u$-plane, then it is immediately assumed that the quantity
$\frac{du}{dt}$ can never become infinitely small or infinitely large at any point in
the interior of $T$, and therefore that the function $\frac{d}{dt} \log \frac{du}{dt}$ has the
character of an entire function for all values of the argument $t$.
In the present case the singular values of $t$ lying in the finite part of
the plane are $t = -1$, $t = 0$, $t = +1$; $\alpha$ is equal to $\frac{1}{2}$. The function

\[ \frac{d}{dt} \log \frac{du}{dt} + \frac{1}{2} \left[ \frac{1}{t+1} + \frac{1}{t} + \frac{1}{t-1} \right], \]

which for all real values of the argument likewise has real finite values,
has the character of an entire function for all finite values of $t$ with
positive imaginary part, and so for all finite values of $t$ it has the
character of an entire function.⁵ For the infinite value of $t$ there is the
development

\[ u - u_0 = \frac{C}{t^{1/2}} \left( 1 + \frac{c_1}{t} + \frac{c_2}{t^2} + \cdots \right), \]

from which

\[ \frac{d}{dt} \log \frac{du}{dt} = -\frac{3}{2t} + \frac{e_1}{t^2} + \frac{e_2}{t^3} + \cdots, \]

so the function $\frac{d}{dt} \log \frac{du}{dt}$ is infinitely small for all infinite values of $t$, and
therefore $\frac{d}{dt} \log \frac{du}{dt}$ is a rational function of $t$ and indeed equal to
$-\frac{1}{2} \left( \frac{1}{t+1} + \frac{1}{t} + \frac{1}{t-1} \right)$. From this it follows by integration that

\[ \log \frac{du}{dt} = -\frac{1}{2} \log 4t(1 - t^2) + \log C_1, \]

\[ u = C_1 \int_0^t \frac{dt}{\sqrt{4t(1 - t^2)}} + C_2. \]


One can easily recognise that in this case the lemniscatic integral

\[ u = \int_0^t \frac{dt}{\sqrt{4t(1 - t^2)}} \]

represents the interior of each of the two half-planes in which the plane
is divided by the real axis conformally on the interior of a square with
sides $\int_0^1 \frac{dt}{\sqrt{4t(1 - t^2)}}$. Through a linear fractional substitution of $s$ for $t$ one goes from the
half-plane lying on the positive side of the real axis [i.e. the upper
half-plane, JJG] to the surface of a circle of radius 1 drawn in the $s$-plane
around the point $s = 0$.
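
The claim that the image is a square can be tested numerically by comparing two adjacent sides. The following Python sketch is an illustration added here, not part of Schwarz's paper. It integrates the modulus of $du/dt = 1/\sqrt{4t(1 - t^2)}$ over $0 < t < 1$ and over $1 < t < \infty$. The substitutions $t = \sin^2\theta$ and $t = 1/\sin^2\theta$, which remove the endpoint singularities, and the midpoint rule are choices made for this sketch, and the function names are hypothetical.

```python
import math

def integrand(t):
    """Modulus of du/dt = 1 / sqrt(4 t (1 - t^2)) for real t."""
    return 1.0 / math.sqrt(abs(4.0 * t * (1.0 - t * t)))

def side_0_to_1(n=2000):
    """Length of the side that is the image of 0 < t < 1.

    Uses t = sin(theta)**2 with 0 < theta < pi/2 and the midpoint rule,
    so the integrand is never evaluated at a singular endpoint.
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        t = math.sin(theta) ** 2
        dt = 2.0 * math.sin(theta) * math.cos(theta)  # dt/dtheta
        total += integrand(t) * dt
    return total * h

def side_1_to_infinity(n=2000):
    """Length of the side that is the image of 1 < t < infinity.

    Uses t = 1 / sin(theta)**2 with 0 < theta < pi/2 and the midpoint rule.
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        s = math.sin(theta)
        t = 1.0 / (s * s)
        dt = 2.0 * math.cos(theta) / s ** 3  # |dt/dtheta|
        total += integrand(t) * dt
    return total * h

if __name__ == "__main__":
    print(side_0_to_1(), side_1_to_infinity())
```

The two lengths agree, as they must for a square. Both are close to 1.3110, which is the value of the integral of $1/\sqrt{1 - s^4}$ from 0 to 1.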


[Schwarz then considered a number of other cases, of which I only
translate these two.]

If the problem is to represent the interior of a half-plane $T$
conformally on the interior of a straight-sided triangle with angles
$\alpha\pi$, $\beta\pi$, $\gamma\pi$ then by an analogous argument one deduces that the
representation is provided by a function of the form

\[ Cu + C_1 = \int_{t_0}^{t} (t - a)^{\alpha - 1} (t - b)^{\beta - 1} (t - c)^{\gamma - 1} \, dt. \]

In this case the three real values $a$, $b$, $c$ that correspond to the three vertices of
the straight-sided triangle can be chosen arbitrarily provided they
follow the same order on the boundary of the half-plane $T$ as the angles
$\alpha\pi$, $\beta\pi$, $\gamma\pi$ of the triangle are encountered on a circuit around the
interior of the triangle.

With this result the form of the function is found that conformally
represents the surface of a half-plane onto the simply connected
surface of any straight-sided polygon. In the general case of an $n$-gon,
only three of the $n$ real values of $t$ that correspond to the
vertices of the straight-sided polygon can be chosen arbitrarily, the
remaining $n - 3$ are determined by the given ratios of the lengths of
the individual sides of the polygon under consideration.

[A little later in the paper, Schwarz remarked:] 

Concerning the problem of representing the surface of a straight- 
sided polygon on the surface of a circle I had the pleasure of seeing the 
researches of Herr Christoffel on the subject: Sul problema delle
temperature stazionarie e la rappresentazione di una data superficie,
Annali di matematica, IIª serie, tomo 1°, 1867, that seem to agree with
mine.


31.3.1 The Schwarz-Christoffel Transformation 


It will be helpful to give a more modern account. Consider the problem
of finding a map $f$ from the upper half-plane onto a triangle
with vertices at the points $b_1$, $b_2$, $b_3$ and angles at these points of
$\alpha_1\pi$, $\alpha_2\pi$, $\alpha_3\pi$, respectively. We require that the map sends the points
$a_1$, $a_2$, $a_3$ on the real axis to the points $b_1$, $b_2$, $b_3$, and we look for a map
that is holomorphic everywhere except at the points $a_1$, $a_2$, $a_3$.

The map will fail to be holomorphic precisely at the points where its
derivative vanishes (or becomes infinite), which suggests that the map must be such that

\[ f'(z) = (z - a_1)^{k_1} (z - a_2)^{k_2} (z - a_3)^{k_3} \]

for suitable exponents $k_1$, $k_2$, $k_3$. Now, the map $z \mapsto z^{\alpha}$ maps the origin to the origin and an interval
around the origin to two line segments meeting at an angle of $\alpha\pi$, and
its derivative is

\[ \alpha z^{\alpha - 1}, \]

so this suggests that when mapping the half-plane to a triangle we try
maps for which

\[ f'(z) = (z - a_1)^{\alpha_1 - 1} (z - a_2)^{\alpha_2 - 1} (z - a_3)^{\alpha_3 - 1}. \]


Notice that away from the pre-images of the vertices the map is locally
one-to-one, because $f'$ does not vanish. So we can find the image of a
domain that does not have a pre-image of a vertex in its interior by
finding the image of the boundary of the domain.

Integration produces the map

\[ f(z) = C_0 \int_0^z (z - a_1)^{\alpha_1 - 1} (z - a_2)^{\alpha_2 - 1} (z - a_3)^{\alpha_3 - 1} \, dz + C_1. \]

Certainly, this map maps angular segments of $\pi$ around each $a_j$ to
angular segments of $\alpha_j\pi$ around each of three points. But are they the
points $b_j$, and are the sides (the images of the segments joining each
$a_j$ to the next one, via $\infty$ if need be) straight?


The second question is easy, and the answer is Yes. Consider the
interval $(a_1, a_2)$. For $a_1 < x < a_2$ the integral can be written as

\[ \int_{a_1}^{x} (\xi - a_1)^{\alpha_1 - 1} (\xi - a_2)^{\alpha_2 - 1} (\xi - a_3)^{\alpha_3 - 1} \, d\xi, \]

which is real up to a constant factor, because each factor $\xi - a_j$ keeps
a constant sign on the interval, and so the image is part of a straight
line. We can adjust the arbitrary constants to map $a_1$ to $b_1$ and
rotate the line segments around $b_1$ to point in the right directions, and
because maps of the form $z \mapsto C_0 z + C_1$ map line segments to line
segments, the images of the sides through $b_1$ will be straight. We can
do this for any vertex, so the image of the upper half-plane has straight
sides.

That brings us to the first question, because everything now
depends on the lengths of the sides. In the case of a triangle this is easy,
because a triangle is determined up to size by its angles, and the
constant $C_0$ takes care of any potential problem.
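
The statement that the angles alone fix the shape of the image triangle can be illustrated with a short computation. The following Python sketch is an addition made here and rests on assumptions that are not in the text above. It places the pre-images of the vertices at $0$, $1$ and $\infty$, so that the factor belonging to the third vertex drops out and $f'(z) = z^{\alpha_1 - 1}(1 - z)^{\alpha_2 - 1}$ with $C_0 = 1$. It also uses the standard fact that the resulting side-length integrals are values of Euler's Beta function. The function names are hypothetical. The sketch checks that the three side lengths satisfy the law of sines for a triangle with angles $\alpha_1\pi$, $\alpha_2\pi$, $\alpha_3\pi$.

```python
import math

def beta(a, b):
    """Euler's Beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def triangle_sides(alpha1, alpha2):
    """Side lengths of the image triangle for f'(z) = z**(alpha1-1) * (1-z)**(alpha2-1).

    Assumes the pre-images of the vertices are 0, 1 and infinity and C_0 = 1.
    The third angle is alpha3 * pi with alpha3 = 1 - alpha1 - alpha2.
    """
    alpha3 = 1.0 - alpha1 - alpha2
    side_12 = beta(alpha1, alpha2)  # integral of |f'| over (0, 1)
    side_23 = beta(alpha2, alpha3)  # integral of |f'| over (1, infinity)
    side_31 = beta(alpha3, alpha1)  # integral of |f'| over (-infinity, 0)
    return side_12, side_23, side_31

def law_of_sines_ratios(alpha1, alpha2):
    """Each side divided by the sine of the opposite angle."""
    alpha3 = 1.0 - alpha1 - alpha2
    side_12, side_23, side_31 = triangle_sides(alpha1, alpha2)
    return (side_12 / math.sin(math.pi * alpha3),
            side_23 / math.sin(math.pi * alpha1),
            side_31 / math.sin(math.pi * alpha2))

if __name__ == "__main__":
    # Angles pi/2, pi/3, pi/6: the three ratios should coincide.
    print(law_of_sines_ratios(0.5, 1.0 / 3.0))
```

The three ratios coincide, which is what one expects if the image is a straight-sided triangle with the prescribed angles. Changing $C_0$ rescales all three sides together.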


But what happens if we want to map the upper half-plane onto a 
quadrilateral, or more generally an n-gon with prescribed vertices and 
sides? Can it be done? It turns out that the answer to this question is 
also Yes, but the computation of the lengths of the sides of the polygon 
as functions of choice of positions of the pre-images of the vertices and 
the angles is difficult and cannot be discussed here.° For maps of the 
upper half-plane onto n-gons with given angles ($n > 3$) the positions,
$a_j$, of the pre-images of the vertices must also be specified and the
proof that a solution can always be found is delicate. In particular, the
length of the side with vertices $f(a)$ and $f(b)$ is given by the integral

\[ \int_a^b |f'(x)|\,dx. \]


Notice that the function $f(z)$ involves all the pre-images of the vertices.
It can be done, and the formula that does it, the natural generalisation
of the above integral to any number of vertices, is called the
Schwarz-Christoffel formula.
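
The claim that the sides are straight and meet at the right angles can be checked numerically. The following is a minimal sketch, not part of the original account, and it assumes the standard normalisation in which the interior angle at the image of $a_j$ is $\alpha_j\pi$ and the derivative is the product of the factors $(x - a_j)^{\alpha_j - 1}$. The function names, the sample pre-images $-1, 0, 1$ and the angles $\pi/2, \pi/3, \pi/6$ are illustrative choices only. On each real interval between pre-images the argument of $f'$ should be constant (so the image is a straight segment), and it should jump by $\pi(1 - \alpha_j)$ at each $a_j$.

```python
import cmath
import math


def sc_derivative(x, pre_images, alphas):
    """Product of (x - a_j)**(alpha_j - 1) at a real point x (principal branch)."""
    value = complex(1.0, 0.0)
    for a, alpha in zip(pre_images, alphas):
        # complex(x - a, 0.0) has argument 0 when x > a and pi when x < a.
        value *= complex(x - a, 0.0) ** (alpha - 1.0)
    return value


def side_direction(x, pre_images, alphas):
    """Direction (argument of f') of the image curve at the real point x."""
    return cmath.phase(sc_derivative(x, pre_images, alphas))


def interior_angle(j, pre_images, alphas, step=0.25):
    """Interior angle at the image of a_j, read off from the jump in direction."""
    a = pre_images[j]
    before = side_direction(a - step, pre_images, alphas)
    after = side_direction(a + step, pre_images, alphas)
    turn = (after - before) % (2.0 * math.pi)  # exterior turning angle
    return math.pi - turn


# Hypothetical sample data: a 90-60-30 degree triangle.
PRE_IMAGES = (-1.0, 0.0, 1.0)
ALPHAS = (1.0 / 2.0, 1.0 / 3.0, 1.0 / 6.0)

if __name__ == "__main__":
    for j in range(3):
        print(j, interior_angle(j, PRE_IMAGES, ALPHAS) / math.pi)
```

The sketch only tests directions; it says nothing about the lengths of the sides, which is where the real difficulty for polygons with more than three vertices lies.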


31.4 An Extract from Schwarz, On the
Alternating Method (1870)


[This comes from Schwarz’s paper “On a passage to a limit by an
alternating method” [244].]

The rigour of the well-known inference that goes under the name of 
the Dirichlet principle, and that in a certain sense must be seen as the 
foundation of the branch of the theory of analytic functions developed 
by Riemann, is subject, as is now quite generally admitted, to very well- 
founded objections whose complete resolution to my knowledge the 
efforts of mathematicians have not yet achieved. 

By developing some enquiries, which involve a certain kind of 
representation, and part of which I have published in vol. 70 of 
Borchardt’s Journal and in the paper “On the theory of representation” 
in the programme of the polytechnic school for the Winter semester 
1869-70, I have been led to a method of proof by means of which I am 
convinced that all the theorems that Riemann used the Dirichlet 
principle to prove in his published works can be proved rigorously. 

The following report is essentially a summary of a work on the
integration of the partial differential equation $\Delta u = 0$, that I reported
on to Herr Kronecker and some other mathematicians in November last
year.

It is concerned essentially only with the proof of the existence of a
function u that on a given domain T of the independent variables x and
y satisfies the partial differential equation

\[ \Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \]

and also satisfies certain prescribed boundary and discontinuity 
conditions. 

For brevity, I restrict myself here to the case in which the auxiliary 
conditions are only boundary conditions and therefore imply that the 
function u is always continuous and takes prescribed finite values on 
the boundary of the domain T, which consists of one or more 
continuous parts. The general case can be reduced to this case by a 
method to be described. 

For the applicability of this method of proof it is in no way 
necessary to assume that the boundary curve of T has only finitely 
many corners, nor that in general at every point it has a finite radius of 
curvature, an assumption that Herr Weber and Herr Carl Neumann 
made for this purpose in their researches (see Borchardt’s Journal, vol. 
71, p. 29 and the Berichte der mathematisch-physischen Classe der
Königlich Sächsischen Gesellschaft der Wissenschaften, 21 April 1870).
At no point will the tangent to the boundary curve be assumed to vary 
continuously; rather, it is enough to know that the boundary curve can 
be divided into a finite number of pieces such that in the interior of 
each piece the change in the direction of the tangent is always in the 
same sense even though it may also have infinitely many jumps, and so, 
therefore, the boundary curve can have infinitely many corners. 

Cusps on the boundary curve are also not excluded. I have carried 
out the analysis of such cusps, which arise from the contact of two 
analytic curves that have the character of algebraic curves in a 
neighbourhood of the point of contact; but to avoid unnecessary 
complications here no reference is made in what follows to the 
presence of cusps. 

The success of the proof whose basic idea is reported here rests in 
the last analysis on the following lemma: 

The boundary line of the domain T for which it is possible to 
integrate the partial differential equation $\Delta u = 0$ with arbitrary
boundary conditions, will be divided into a finite number of segments
(parts). These can be arranged in two groups in such a way that each
group contains at least one segment. One can give the individual
segments, according as they belong to the first or second group, an odd 
or even number and denote the points that separate the segments with 
an even number from those with an odd number by P. In the interior of 
T one considers a finite number of analytic curves L that have either no 
point or only an end point P in common with the odd-numbered 
segments and are not tangents at these points. 

In this way we determine a function u for the domain T that satisfies
the partial differential equation $\Delta u = 0$ and at all points of the
boundary of T has the value 0 or 1 according as the number of the
segment in the interior of which the given point lies is even or odd.
Then the upper bound, respectively, the maximum of all values that the
function u takes on the curves L, is a positive number q that is less than
1.

We now determine a function $u_1$ that satisfies the partial
differential equation $\Delta u_1 = 0$ on the same domain T, with the same
division of the boundary into odd and even numbered segments and
the same curves L, that takes the value 0 on the even numbered
segments of the boundary, and on the odd numbered segments takes
arbitrarily prescribed values that do not exceed a quantity g in absolute
value; then the absolute values of the values that the function $u_1$ can take
on the curves L never exceed the value $gq$, where q has the previously
ascribed significance and so is less than 1.

For the surface of the circle, and for all simply connected surfaces 
that are known to be conformal images of the circle, the integration of 
the partial differential equation $\Delta u = 0$ with prescribed boundary
conditions presents no difficulty. In this regard the task may be treated
as in a paper in this journal (pp. 113-128 of the current year); there,
breaks in the continuity for the function u in the prescribed series of 
values for the function u are exceptionally excluded; in this way the 
inference there developed can, mutatis mutandis, also be given if a 
finite number of boundary points in the series of boundary points are 
subjected to a break in continuity. 


After it is shown that for a number of simpler domains the
differential equation $\Delta u = 0$ can be integrated for arbitrary boundary
conditions, the proof has to be found to show that for a less simple
domain that is composed of these in a certain way the integration of the
differential equation is also possible with arbitrary boundary conditions.
For the proof of this theorem a limiting argument can serve that has a great
analogy with a two-chamber air pump used to produce an evacuated
space. In both cases each period of the operation consists of two
alternately operating single operations, which indeed have the same
purpose, but are not identical in respect of the way and manner in which
they work, but are rather in a certain sense symmetric (Fig. 31.1).

Such a limiting argument may be called a limiting argument by an 
alternating method. 

Let two domains $T_1$ and $T_2$ be given which have one or more
domains $T^*$ in common, and whose boundary lines are not tangent. (In
the schematic figure 10, $T_1$ is the surface of a circle, $T_2$ the surface of a
square.)
Fig. 31.1 Schwarz’s Figure 10. Schwarz, Gesammelte mathematische Abhandlungen, vol. II, p. 136


The totality of all parts of the boundary of $T_1$ that lie outside $T_2$
will be denoted $L_0$, the totality of all remaining parts of the boundary
of $T_1$ that lie inside $T_2$ will be denoted $L_2$.

Likewise the boundary of $T_2$ divides into the two parts $L_1$ and $L_3$,
if indeed the totality of all pieces of the boundary that lie inside $T_1$ will
be denoted by $L_1$, and the totality of all parts of the boundary that lie
outside $T_1$ will be denoted $L_3$.

It will be assumed that equally for the domains $T_1$ and $T_2$ it is
possible to integrate the partial differential equation $\Delta u = 0$ with
arbitrary boundary conditions; it then remains to show that this is also
possible for the domain $T_1 + T_2 - T^* = T$ that has the domains $T_1$ and
$T_2$ as parts and in which their common domain $T^*$ is counted only
once.

The conditions of the previous lemma are satisfied as well for the
domain $T_1$ and the curve $L_1$ as for the domain $T_2$ and the curve $L_2$; in
the first case the curve $L_0$ and in the second the curve $L_3$ may be taken
as the location of the group of segments of even order. It is therefore
possible to determine two numbers $q_1$ and $q_2$ that play the role of q in
the lemma and are therefore both less than 1.

To the receiver of the air pump corresponds, maintaining the
above analogy, the domain $T^*$; to the interior of the two pump cylinders
correspond the domains $T_1 - T^*$, $T_2 - T^*$; the vents to
the curves $L_1$ and $L_2$.

On the boundary of T, thus along $L_0$ and $L_3$, let the values of the
function be given arbitrarily; let g be the upper bound and k the lower
bound of these values; the difference $g - k$ will be denoted G.


Now one takes along $L_2$ an arbitrary sequence of values, for
example, the value k at every point of $L_2$, and determines for the
domain $T_1$ a function $u_1$ that takes the prescribed values along $L_0$, has
the value k along $L_2$ and satisfies the differential equation $\Delta u_1 = 0$ in
the interior of $T_1$. By the assumption about the domain $T_1$ there is
such a function. (First push of the first piston.)

The values that the function $u_1$ has along $L_1$ one thinks of as fixed,
and determines a function $u_2$ for the domain $T_2$ that along $L_3$ has the
prescribed values and agrees with the previously determined function
$u_1$ along $L_1$, and for which $\Delta u_2 = 0$. By the assumption about the
domain $T_2$ there is such a function. (First push of the second piston.)

The value of $u_2 - u_1$, or $u_2 - k$, along $L_2$ is smaller than $g - k = G$.

One determines for the domain $T_1$ a function $u_3$ that takes the
prescribed values along $L_0$, has the value $u_2$ along $L_2$ and for which
$\Delta u_3 = 0$ in the interior of $T_1$. (Second push of the first piston.)

At no point in the interior of $T_1$ is the difference $u_3 - u_1$ negative;
in absolute value the difference $u_3 - u_1$ is less than G, and along $L_1$ it is
by the earlier lemma less than $Gq_1$, because $u_3 - u_1$ has the value 0 along
$L_0$ and along $L_2$ it is smaller than G.

The values that the function $u_3$ has along $L_1$ one thinks of as fixed,
and determines a function $u_4$ for the domain $T_2$ that along $L_1$ agrees
with $u_3$ and along $L_3$ has the prescribed values and for which $\Delta u_4 = 0$.
By the assumption about the domain $T_2$ there is such a function.
(Second push of the second piston.)

The difference $u_4 - u_2$ along $L_3$ has the value 0, and along $L_1$,
where it agrees with $u_3 - u_1$, it is positive and smaller than $Gq_1$;
therefore $u_4 - u_2$ is never negative in the interior of $T_2$ and is always
less than $Gq_1$, but along $L_2$ smaller than $Gq_1q_2$.


By continuing this alternating method one obtains a sequence of
infinitely many functions with odd and even index. The ones for the
domain $T_1$ and the others for the domain $T_2$ are so defined that,
respectively, along $L_0$ and $L_3$ they have the prescribed values and in
the interior of the domains on which they are defined they satisfy the
partial differential equation $\Delta u = 0$.

For the domain $T^*$ the functions with odd and even index are
defined and indeed they agree with each other alternately along $L_1$
and $L_2$. Indeed along $L_1$, $u_{2n} = u_{2n-1}$, and along $L_2$, $u_{2n+1} = u_{2n}$.

It is now not difficult to prove that the functions with odd and even
index approach definite limiting functions $u'$ and $u''$ as the index
increases, as is shown by the equations

\[ u' = u_1 + (u_3 - u_1) + (u_5 - u_3) + \cdots + (u_{2n+1} - u_{2n-1}) + \cdots \]

\[ u'' = u_2 + (u_4 - u_2) + (u_6 - u_4) + \cdots + (u_{2n+2} - u_{2n}) + \cdots. \]

The series on the right-hand side converge unconditionally and
uniformly (“in gleichem Grade”) for all pairs of values of x, y under
consideration, indeed

\[ (u_{2n+1} - u_{2n-1}) < G(q_1 q_2)^{n-1} \quad\text{and}\quad (u_{2n+2} - u_{2n}) < G(q_1 q_2)^{n-1} q_1. \]

Along $L_1$ as well as along $L_2$, $u' = u''$. In the interior of $T_1$, $\Delta u' = 0$, in
the interior of $T_2$, $\Delta u'' = 0$, therefore at every point of $T^*$, $u' = u''$,
because along the entire boundary of $T^*$ both functions agree with
each other.


Therefore both functions $u'$ and $u''$ are values of the same function
u, which is defined for the interior of the whole domain
$T_1 + T_2 - T^* = T$, satisfies the partial differential equation $\Delta u = 0$ there,
and takes the prescribed values along the boundary $L_0 + L_3$.

Thus the proof of the correctness of the above considerations is
established: Under the given conditions it is possible for the domain T
to integrate the partial differential equation with arbitrary prescribed
boundary conditions.
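
The alternating procedure can be tried out on a one-dimensional toy problem. This is a sketch under stated assumptions, not Schwarz's own setting. Here T is the interval [0, 1], the two overlapping domains are [0, b] and [a, 1] with a < b, and "harmonic" simply means linear, so each push of a piston is a linear interpolation. The function names and the sample numbers are hypothetical. In this setting the two constants of the lemma are a/b and (1 - b)/(1 - a), and the error at the point b is multiplied by their product on every complete sweep.

```python
def linear_harmonic(x, x_left, u_left, x_right, u_right):
    """Value at x of the linear (1D harmonic) function with the given end values."""
    t = (x - x_left) / (x_right - x_left)
    return (1.0 - t) * u_left + t * u_right


def schwarz_alternating_1d(g0, g1, a, b, sweeps, start):
    """Alternate between T1 = [0, b] and T2 = [a, 1], where 0 < a < b < 1.

    g0, g1 : prescribed values at x = 0 and x = 1 (the boundary of T)
    start  : initial guess for the value at x = b (the inner end of T1)
    Returns the list of values at x = b after each complete sweep.
    """
    value_at_b = start
    history = []
    for _ in range(sweeps):
        # Push of the first piston: solve on T1, read off the value at x = a.
        value_at_a = linear_harmonic(a, 0.0, g0, b, value_at_b)
        # Push of the second piston: solve on T2, read off the value at x = b.
        value_at_b = linear_harmonic(b, a, value_at_a, 1.0, g1)
        history.append(value_at_b)
    return history


if __name__ == "__main__":
    values = schwarz_alternating_1d(0.0, 1.0, 0.4, 0.6, 10, 0.0)
    print(values)  # approaches 0.6, the value at x = 0.6 of the solution u(x) = x
```

Starting from the lower bound k = 0 of the boundary values, as in the text, the iterates increase monotonically to the true value, and the closer a is to b (the thinner the overlap) the slower the convergence.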

By repeated application and suitable modification of the explained 
limiting process of the alternating method the existence of a function u 
on a given domain can be established also for boundary conditions with 
discontinuities, or the prescribed discontinuities such as Abelian 
integrals possess, for which Riemann required the existence in his work 
and sought to prove using the Dirichlet principle. 

The outlined method of proof extends not only to the case in which 
the domain T is represented geometrically as a simply or multiply 
connected Riemann surface in its entire extension in the plane or the 
surface of a sphere, but is essentially unaltered also in the case in which 
this surface is spread over one or many plane or spherical surfaces and 
polyhedral surfaces. 

By means of this extension can the proof be given, among other 
things, that a simply connected domain spread over a polyhedral 
surface can be conformally represented on the surface of a circle if this 
surface has a closed boundary curve and on the surface of a sphere if it 
is a simply connected and closed domain. 

In this way the question is answered of the possibility of 
determining the constants to which the conformal representation of a 
simply connected covering surface of polyhedron bounded by plane 
figures on the surface of a sphere can be reduced (see Borchardt’s 
Journal, vol. 70, p. 119). 

A special case of the just-mentioned problem occurs when it is 
required to map a simply connected surface in the form of a plane 
figure with polygonal boundary conformally onto the surface of a circle, 
where the surface of the polygon may lie entirely in the finite or contain 
the infinitely distant point once or several times in its interior; branch 


points in the interior are also not excluded. For this problem the sole 
difficulty consists in the proof of the possibility that on a certain 
number of parts real and on some parts complex conjugate constants 
can be determined, upon which the function providing the conformal 
representation depends, so that all the conditions of the problem are 
satisfied. 

This difficulty can be overcome by the method that Herr 
Weierstrass has developed. The application of the above limiting 
process offers a new way to overcome it. 

Similarly, the proof is provided of the possibility of determining the 
constants to which the problem of the conformal representation of a 
simply connected figure bounded by circular arcs upon the surface of a 
circle is reduced. 

[The later Nachtrag, a dispute with Christoffel on conformal 
representation, is not translated. | 


31.5 Schwarz on the Hypergeometric Equation 
(1873)—A Summary 


Report on those cases in which the Gaussian hypergeometric series
$F(\alpha, \beta, \gamma, x)$ is an algebraic function of its fourth element.


Schweizerischen Naturforschenden Gesellschaft, 1871, 74-77, (session of 
22 August 1871), in Gesammelte Mathematische Abhandlungen 2, 172- 
174. 

The problem of studying when a given (ordinary) algebraic 
differential equation has a particular algebraic solution, and, if this is 
the case, of finding all its particular algebraic solutions, still belongs 
today to the most difficult problems in analysis. It seems, in the present 
state of knowledge, that the problem must be tackled in isolated cases 
with the help of such methods as are appropriate to the special cases 
under consideration. 

For the second-order linear ordinary differential equation that the
Gaussian hypergeometric series $F(\alpha, \beta, \gamma, x)$ satisfies, considered
as a function of its fourth element, the following train of thought leads
to a complete solution of the given problem.

The general solution of the given differential equation can, as it is
easy to see, only be an algebraic function of the independent variable x
if the first three elements $\alpha$, $\beta$, $\gamma$ are real and indeed rational numbers.


If on the assumption that these conditions are met, one considers in 
addition to the general solution of the differential equation the 
quotient of two linearly independent particular solutions, then this 
latter is related to the general solution in such a way that either both 
are algebraic functions of the argument x or neither of the two
functions depends algebraically on the quantity x.

The independent variable x is an unrestricted variable quantity that 
can take all real and complex values. If one now thinks of the plane, 
whose points represent the values of the complex quantity x 
geometrically, as divided by the real axis into two half-planes, and 
considers the conformal representation that is provided by a branch s 
of the above-mentioned quotient as a function of the complex variable 
x, then there corresponds to each of the two half-planes a figure whose 
points represent geometrically the values of the complex quantity s; 
one, generally, a domain bounded by three circular arcs and which can 
therefore be called a circular-arc triangle. 

By analytic continuation of the branch of s under consideration 
there arise in general infinitely many circular-arc triangles in the plane 
of the complex variable s, and indeed every neighbouring two of them 
have a side in common. If, in a special case, this side is straight, then 
both triangles correspond one to the other in the usual way, i.e. they are 
symmetrical figures with respect to the line. If however as a 
consequence of the development the common side is a circular arc— 
and this is the general case—then in place of the usual symmetry a 
Möbius circular transformation occurs, and indeed the circle to which the
common arc belongs is the directrix of this transformation. This
relationship can rather be called symmetry with respect to an arc.

Through these considerations the problem that is to be solved, that 
was originally function-theoretic, reduces to the following geometric 
one: Find all circular-arc triangles that on being multiplied by this 
symmetry law occupy only a finite number of positions and have the 
form of various different circular-arc triangles. 


Through geometric arguments one now finds that the number of 
different symmetric repetitions of a circular-arc triangle can only be 
finite when it is possible to map this triangle conformally onto the 
surface of a sphere so that it corresponds to a spherical triangle. Since 
now for a spherical triangle all corresponding repetitions are either 
symmetric figures in the strict sense or congruent figures, in this way 
the question is reduced to the following purely geometric problem: 

“A body has only a finite number of symmetry planes: find all of the 
different positions these can have”. 

This problem, already solved by Steiner, leads either to a family of n
planes with a common axis, in which each plane meets the next at an
angle of $\pi/n$, and with a plane that cuts each plane of this family at
right angles, or to the symmetry planes of a regular polyhedron.

This is the connection of the question “When is the general solution of
the differential equation of the hypergeometric series $F(\alpha, \beta, \gamma, x)$ an
algebraic function of the argument x?” with the theory of regular
polyhedra.

The case in which not the general but only one particular integral of 
this differential equation is an algebraic function of the argument x—a 
case that is easy to analyse—can be set aside here. 


31.6 Darboux on the Solution of Riemann’s 
Equation (1887) 


We now follow the account given in Darboux ([58], vol. 2, Sects. 358- 
360).’ 

Section 358 The adjoint equation to a given linear equation was 
presented for the first time in a memoir by Riemann on the propagation 
of sound. [...] In what follows, we discuss only the equation, already 
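studied in the previous chapter,

\[ F(z) = \frac{\partial^2 z}{\partial x\,\partial y} + a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz = 0 \]
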
[where a, b, c are functions of x and y]. We shall give Riemann’s results 
and indicate the consequences that one can deduce. One has® 


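\[ uF(z) - zG(u) = \frac{\partial M}{\partial x} + \frac{\partial N}{\partial y}, \]

with

\[ G(u) = \frac{\partial^2 u}{\partial x\,\partial y} - a\frac{\partial u}{\partial x} - b\frac{\partial u}{\partial y} + \left(c - \frac{\partial a}{\partial x} - \frac{\partial b}{\partial y}\right)u, \]

\[ M = auz + \frac{1}{2}\left(u\frac{\partial z}{\partial y} - z\frac{\partial u}{\partial y}\right), \]

\[ N = buz + \frac{1}{2}\left(u\frac{\partial z}{\partial x} - z\frac{\partial u}{\partial x}\right). \]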


Riemann showed that if S is a region bounded by a simple closed curve
C then

\[ \iint_S \big(uF(z) - zG(u)\big)\,dx\,dy = \int_C (M\,dy - N\,dx). \qquad (31.30) \]

Suppose that z and u are, respectively, some solutions of the given
equation and its adjoint, so

\[ F(z) = 0, \qquad G(u) = 0. \]

Then the integral over S in Eq. (31.30) vanishes, and we therefore have

\[ \int_C (M\,dy - N\,dx) = 0. \]

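Let A be an arbitrary point in the plane and B'C' a curve placed
arbitrarily in the plane. Draw through A the lines AB and AC parallel to
the coordinate axes, and suppose that the solutions z and u and the
coefficients of the differential equations and their first derivatives are
continuous in ABC. The above equation gives

\[ \int_A^C M\,dy + \int_C^B (M\,dy - N\,dx) - \int_B^A N\,dx = 0. \]

Insert the above values of M and N into this equation, and one gets

\[ \int_A^C M\,dy = \int_A^C \left(\frac{1}{2}\frac{\partial (uz)}{\partial y} - z\left(\frac{\partial u}{\partial y} - au\right)\right)dy, \]

\[ \int_A^B N\,dx = \int_A^B \left(\frac{1}{2}\frac{\partial (uz)}{\partial x} - z\left(\frac{\partial u}{\partial x} - bu\right)\right)dx. \]

If quite generally one denotes by $\varphi_P$ the value of a function $\varphi$ at a point
P, one has

\[ \int_A^C M\,dy = \frac{1}{2}\big((uz)_C - (uz)_A\big) - \int_A^C z\left(\frac{\partial u}{\partial y} - au\right)dy, \]

\[ \int_A^B N\,dx = \frac{1}{2}\big((uz)_B - (uz)_A\big) - \int_A^B z\left(\frac{\partial u}{\partial x} - bu\right)dx, \]

so

\[ (uz)_A = \frac{1}{2}\big((uz)_B + (uz)_C\big) - \int_B^C (M\,dy - N\,dx) - \int_A^B z\left(\frac{\partial u}{\partial x} - bu\right)dx - \int_A^C z\left(\frac{\partial u}{\partial y} - au\right)dy. \qquad (31.31) \]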


We examine each term on the RHS. 

We imagine that, with Riemann, we have to find the solution z of the
given partial differential equation that takes given values, along with
one of its derivatives, at all points of the curve B'C'. The equation

\[ dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy \]

applied to a displacement along the curve evidently determines
whichever one of the partial derivatives was not given a priori, so
we can consider that they are both known at each point of the curve
B'C'. It follows that if one has chosen a solution u of the adjoint
equation, the three terms

\[ (uz)_B, \qquad (uz)_C, \qquad \int_B^C (M\,dy - N\,dx) \]

that enter the RHS of Eq. (31.31) are known and depend only on the
boundary conditions on z. If one could calculate the latter two integrals
on the RHS one would know $z_A$, that is to say the value of z at an
arbitrary point of the plane. Now, in general these integrals depend on
the entirely unknown values that the sought-for solution z takes on the
line segments AB and AC. For these values not to intervene it is
necessary that the solution u has been chosen so that

\[ \frac{\partial u}{\partial x} - bu = 0 \quad \text{everywhere on } AB, \]

\[ \frac{\partial u}{\partial y} - au = 0 \quad \text{everywhere on } AC. \]


If these two equations can be satisfied, then the fundamental
Eq. (31.31) reduces to the following:

\[ (uz)_A = \frac{1}{2}\big((uz)_B + (uz)_C\big) - \int_B^C (M\,dy - N\,dx), \qquad (31.32) \]


which determines the value of z at an arbitrary point of the plane as a 
function of the boundary conditions only. 

Thus, to obtain the general solution of the equation in the form most 
appropriate for problems in mathematical physics, it is enough to find a 
solution of the adjoint equation that satisfies the two equations stated 
above. 

These conditions can be transformed as follows. One must have
$\partial u/\partial x - bu = 0$ everywhere on AB. Because only x varies on this segment,
one can integrate this equation, which gives

\[ u_M = u_A \exp\left(\int_A^M b\,dx\right) \]

for all points M between A and B. Likewise, the second condition can be
replaced by

\[ u_N = u_A \exp\left(\int_A^N a\,dy\right) \]

for all points N between A and C.

One can always reduce the constant $u_A$ to unity, so, if the
coordinates of A are $(x_0, y_0)$ the question is reduced to finding a
solution $u(x, y; x_0, y_0)$ of the adjoint equation depending on two
parameters $x_0, y_0$, that reduces to unity for $x = x_0$, $y = y_0$, taking the
value $\exp\left(\int_{x_0}^{x} b\,dx\right)$ for $y = y_0$ and the value $\exp\left(\int_{y_0}^{y} a\,dy\right)$ for $x = x_0$.


This is the fundamental result established by Riemann. The great
mathematician had been able to determine a function u for the
equation that he had discussed and which is no other than equation
$E(\beta, \beta)$. We shall see that the determination of this function can also be
carried out for the more general equation $E(\beta, \beta')$, but first staying
with the general theory we shall add an essential remark to the result
we have just given.

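Section 359 Suppose that the primitive curve B'C' reduces to two
straight lines B'D and DC' parallel to $Oy$ and $Ox$ [the y and x axes,
respectively], and let $(x_1, y_1)$ be the coordinates of the point D. One will
have

\[ \int_C^B (N\,dx - M\,dy) = \int_C^D N\,dx - \int_D^B M\,dy. \]

Moreover, one can write

\[ \int_C^D N\,dx = \int_C^D \left(\frac{1}{2}\left(u\frac{\partial z}{\partial x} - z\frac{\partial u}{\partial x}\right) + buz\right)dx = \int_C^D \left(-\frac{1}{2}\frac{\partial (uz)}{\partial x} + u\left(\frac{\partial z}{\partial x} + bz\right)\right)dx, \]

and so

\[ \int_C^D N\,dx = \frac{1}{2}\big((uz)_C - (uz)_D\big) + \int_C^D u\left(\frac{\partial z}{\partial x} + bz\right)dx. \]

Likewise one has

\[ \int_B^D M\,dy = \frac{1}{2}\big((uz)_B - (uz)_D\big) + \int_B^D u\left(\frac{\partial z}{\partial y} + az\right)dy. \]

Therefore, substituting these values in the above equations, one has

\[ (uz)_A = (uz)_D - \int_C^D u\left(\frac{\partial z}{\partial x} + bz\right)dx - \int_B^D u\left(\frac{\partial z}{\partial y} + az\right)dy. \qquad (31.33) \]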


This formula applies to every solution z of the given equation. It offers
the greatest analogy with the general Eq. (31.32), but it is distinguished
by an essential property. Indeed, one recognises that it is not now
necessary to prescribe one of the derivatives of z on the contour C'DB'.
Knowing only the values of the sought-for solution on the lines C'D
and B'D allows one to calculate the two integrals that the preceding
formula contains and to obtain the value of this solution. It is necessary
to look for the origin of this very interesting result in the circumstance
that the new contour is formed with the characteristics of the given
linear equation.

Suppose now that one takes for z this particular solution
$z(x, y; x_1, y_1)$ of the given equation that is given by entirely similar
conditions to those indicated for $u(x, y; x_0, y_0)$ considered as a solution
of the adjoint equation. As it is necessary to change the sign of the
coefficients a and b when one passes from one equation to the other,
one sees that this solution must reduce

for $y = y_1$ to $\exp\left(-\int_{x_1}^{x} b\,dx\right)$,

for $x = x_1$ to $\exp\left(-\int_{y_1}^{y} a\,dy\right)$,

and consequently to 1 for $x = x_1$, $y = y_1$. Therefore one has

\[ \frac{\partial z}{\partial x} + bz = 0 \quad \text{at all points of } CD, \]

\[ \frac{\partial z}{\partial y} + az = 0 \quad \text{at all points of } BD, \]

\[ z = 1 \quad \text{at } D. \]

Consequently, Eq. (31.33) reduces here to $(uz)_A = (uz)_D$, that is to say

\[ z(x_0, y_0; x_1, y_1) = u(x_1, y_1; x_0, y_0). \]


This equality implies the following proposition: The solution
$u(x, y; x_0, y_0)$ of the adjoint equation that we defined before can be
considered as a function of the parameters $x_0, y_0$; it is then a solution of
the primitive equation (where one will have replaced x, y by $x_0, y_0$) and
possesses, with respect to that equation and the variables $x_0, y_0$, the
properties by which it has been defined as a function of the variables
x, y and is a solution of the adjoint equation. In other words, the
definition of u does not change if one switches the linear equation and its
adjoint, on condition that one switches the two systems of variables x, y
and $x_0, y_0$.

It follows that the determination of this function $u(x, y; x_0, y_0)$ also
allows one to integrate the adjoint equation by a formula analogous to
what has been given above. The integration of two linear equations, the
given one and its adjoint, therefore leads to one and the same problem, the
determination of the function $u(x, y; x_0, y_0)$. This function can be
completely defined, either as the solution of the given equation, or as the
solution of the adjoint equation, via the boundary conditions to which
they are subjected.
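
A concrete instance may help fix ideas. The following sketch is not taken from Darboux's text. It assumes the special case a = b = 0 with c constant, so that the equation is $\partial^2 z/\partial x\,\partial y + cz = 0$ and coincides with its adjoint. For this case the function $J_0\big(2\sqrt{c(x - x_0)(y - y_0)}\big)$, written below as the power series $\sum_k (-s)^k/(k!)^2$ with $s = c(x - x_0)(y - y_0)$, is a standard closed form for the function u. The code checks numerically that it equals 1 on the two lines through $(x_0, y_0)$, as the conditions above require when a = b = 0, and that it satisfies the equation. All names and sample values are illustrative.

```python
def riemann_function(x, y, x0, y0, c, terms=60):
    """u(x, y; x0, y0) = sum_k (-s)^k / (k!)^2 with s = c (x - x0)(y - y0).

    This equals J_0(2 sqrt(s)) for s >= 0; it is the assumed closed form for
    the special equation z_xy + c z = 0 (a = b = 0, c constant).
    """
    s = c * (x - x0) * (y - y0)
    total = 0.0
    term = 1.0
    for k in range(terms):
        total += term
        term *= -s / float((k + 1) ** 2)
    return total


def mixed_derivative(f, x, y, h=1e-3):
    """Central finite-difference approximation to the mixed derivative f_xy."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h * h)


def residual(x, y, x0, y0, c):
    """Finite-difference value of u_xy + c u, which should be close to zero."""
    def u(px, py):
        return riemann_function(px, py, x0, y0, c)
    return mixed_derivative(u, x, y) + c * u(x, y)


if __name__ == "__main__":
    print(riemann_function(0.1, 0.9, 0.1, -0.2, 2.0))  # on the line x = x0
    print(residual(0.7, 0.5, 0.1, -0.2, 2.0))          # small, limited by the step h
```

Because the formula depends on the two points only through the product $(x - x_0)(y - y_0)$, the symmetry $z(x_0, y_0; x_1, y_1) = u(x_1, y_1; x_0, y_0)$ is visible at a glance in this self-adjoint case.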


31.7 Picard and Elliptic Partial Differential 
Equations (1890) 


Picard showed in his [211] that these equations have regular solutions, 
just like the Laplace equation. The paper opens as follows: 

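Let us consider a second-order partial differential equation of the
form

\[ A\frac{\partial^2 u}{\partial x^2} + 2B\frac{\partial^2 u}{\partial x\,\partial y} + C\frac{\partial^2 u}{\partial y^2} = F\left(u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y\right), \qquad (31.34) \]
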
A, B, C depending only on the two independent variables x and y. In
order to solve this equation under certain specified boundary
conditions, one can proceed by successive approximations in the
following manner. In the second member we insert an arbitrary
function $u_1$ of x and y, and form the equation

\[ \Delta u = F\left(u_1, \frac{\partial u_1}{\partial x}, \frac{\partial u_1}{\partial y}, x, y\right) \]

(here putting, for brevity, $\Delta u = A\frac{\partial^2 u}{\partial x^2} + 2B\frac{\partial^2 u}{\partial x\,\partial y} + C\frac{\partial^2 u}{\partial y^2}$). Let us suppose
that we have solved this equation for u, having provided certain boundary
conditions that, we suppose, completely determine an integral that we
denote by $u_2$. One then forms the equation

\[ \Delta u = F\left(u_2, \frac{\partial u_2}{\partial x}, \frac{\partial u_2}{\partial y}, x, y\right) \]

and solves it for u under the same boundary conditions as above, and
continues in this fashion indefinitely. If the solution $u_n$ tends to a
definite limit u as n increases indefinitely one then obtains the solution u
of Eq. (31.34) that satisfies the given conditions.
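
The scheme of successive approximations is easy to try on a one-dimensional analogue. What follows is a sketch under simplifying assumptions and not Picard's own construction: the equation is u'' = F(u, x) on the interval [0, 1] with u = 0 at both ends, the second derivative is replaced by the usual three-point finite difference, and each linear problem is solved by the standard tridiagonal (Thomas) elimination. The function names, the grid size and the sample right-hand side F(u, x) = u + 1 are hypothetical choices. For that F the exact solution is cosh(x - 1/2)/cosh(1/2) - 1, which gives something to compare against.

```python
import math


def solve_dirichlet_1d(rhs, h):
    """Solve u'' = rhs at the interior grid points, with u = 0 at both ends.

    Discretisation: u[i-1] - 2 u[i] + u[i+1] = h^2 rhs[i], solved by the
    Thomas algorithm for tridiagonal systems.
    """
    n = len(rhs)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = 1.0 / -2.0
    d_prime[0] = h * h * rhs[0] / -2.0
    for i in range(1, n):
        denom = -2.0 - c_prime[i - 1]
        c_prime[i] = 1.0 / denom
        d_prime[i] = (h * h * rhs[i] - d_prime[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d_prime[i] - c_prime[i] * u[i + 1]
    return u


def picard_iterate(F, n_interior=99, sweeps=30):
    """Successive approximations: solve u_new'' = F(u_old, x) repeatedly."""
    h = 1.0 / (n_interior + 1)
    xs = [(i + 1) * h for i in range(n_interior)]
    u = [0.0] * n_interior  # the arbitrary starting function
    for _ in range(sweeps):
        rhs = [F(ui, xi) for ui, xi in zip(u, xs)]
        u = solve_dirichlet_1d(rhs, h)
    return xs, u


if __name__ == "__main__":
    xs, u = picard_iterate(lambda u_val, x: u_val + 1.0)
    exact = [math.cosh(x - 0.5) / math.cosh(0.5) - 1.0 for x in xs]
    print(max(abs(p - q) for p, q in zip(u, exact)))
```

On this short interval the iteration contracts quickly, which is the one-dimensional counterpart of the requirement that the contour enclose a sufficiently small area.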

These generalities only have interest when one can make the
boundary conditions precise and put in place conditions that allow us
to establish rigorously the convergence of $u_n$ to the limit u; this is the
point of this memoir. We make the essential supposition that in the
region of the plane containing the point (x, y) the discriminant $B^2 - AC$
does not change its sign. Consequently, we can reduce our equation to
one of the two following types:

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = F\left(u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y\right), \qquad (31.35) \]

\[ \frac{\partial^2 u}{\partial x\,\partial y} = F\left(u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y\right), \qquad (31.36) \]


for which the problems posed are entirely different. 
For equations of the first form we provide as boundary conditions
the values of the functions $u_n$ along a closed contour C, and we require
them to remain continuous along with their partial derivatives of the
first two orders inside the contour. The study of $u_n$ shows that it
converges to a limit provided that C satisfies certain conditions that, in
particular, are
satisfied when the contour bounds a sufficiently small area. In this case 
one obtains a solution of Eq. (31.35) that takes a given continuous 
succession of values on the contour. This solution, as we shall show, is 
moreover unique if the equation is linear, when the contour is 
sufficiently small. 


One cannot affirm in general that the solution is unique when F is
not linear in u, $\partial u/\partial x$, and $\partial u/\partial y$.

In the case of Eq. (31.36), the boundary conditions must be taken in
an entirely different manner. Here we take an arc of a curve C, and along
C we prescribe the values of $\partial u_n/\partial x$ and $\partial u_n/\partial y$ as well as the value of $u_n$ at a
point A of C. Let B be a second point of C, such that the coordinates of a
point M of the curve constantly vary in the same sense as M goes from A
to B; consider the rectangle parallel to the axes of which A and B are
opposite vertices. If B is sufficiently close to A, $u_n$ will tend to a limit u
for all points of this rectangle, and one will have the solution u of
Eq. (31.36) that takes a given value at A and for which $\partial u/\partial x$ and $\partial u/\partial y$ take a
given continuous succession of values on the arc AB; u and its two first
partial derivatives are continuous functions of x and y as one traverses
the arc AB.

The theorems indicated above for Eq. (31.35) are only correct if the 
contour C encloses a sufficiently small area. It is very interesting to find 
equations where without restriction a solution that is continuous 
together with its partial derivatives will always be determined by its 
values on an arbitrary closed contour. 

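One can give some detailed examples. This will happen for the
Eq. (31.35)

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = F\left(u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y\right)$$

if, on replacing $\frac{\partial u}{\partial x}$ by $v$ and $\frac{\partial u}{\partial y}$ by $w$ one has the inequality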
ie aC ea Ce 
whatever u, v, w may be. In particular, if F depends on neither $\frac{\partial u}{\partial x}$ nor $\frac{\partial u}{\partial y}$,
this condition will be satisfied.


We shall make a special study of the case where the equation can be
written as

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = F(u, x, y)$$

and F increases continually with u.

Supposing first of all that F is always positive, we shall see what our 
method of successive approximations can give here. Its use leads to a 
very curious result. This method leads not to one limit, but to two limits 
u and v. These functions take the given values on the contour and 
satisfy the two equations 


$$\Delta u = F(v, x, y), \qquad \Delta v = F(u, x, y).$$


In order that the problem of finding a solution to the given equation
and taking the given values on the contour can be solved, it is necessary
that $u = v$; this will not be the case for an arbitrary contour, but this
identity is verified if the contour is sufficiently small.

In this particular case we shall show in what follows that one can 
pass to an arbitrary contour. In fact, the problem being treated for two 
contours having a part in common can be solved for the bounding 
contour exterior to the two areas. The alternating process that M.
Schwarz and M. Neumann used in their memorable works on the
Laplace equation $\Delta u = 0$ can, with modifications that are in any case
quite obvious, be extended to our general equation, and, as a result,
prove completely effective in the study of the integral, which moreover
is unique, of the equation

$$\Delta u = F(u, x, y)$$

that takes a given continuous succession of values on an arbitrary
closed contour.

I also consider an interesting case in which the function F, which
always increases with u, vanishes for u = 0.


The solutions considered up to now are continuous inside the area.
Taking in particular the equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = A(x, y)e^{u},$$

I examine the case where the integral has logarithmic singular points,
and I particularly direct my attention to the following equation which is
of great interest, in geometry as much as in analysis,

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = k e^{u},$$

where k denotes a positive constant, and which one can call Liouville’s 
equation. I deepen the study of the solutions of this equation by 
considering them in the whole plane and by studying especially those 
which are continuous in the whole plane with the exception of a certain 
number of logarithmic singular points for which one regards the 
corresponding coefficients as given (only satisfying certain 
inequalities); I draw attention here to the following result: These 
solutions depend only on an arbitrary constant, and a solution of this 
kind is determined when its value is given at a point of the plane distinct 
from the singular points. 

Having studied these solutions in the ordinary plane, I extend this to 
the multiple plane, that is to say to the plane covered by a certain 
number of leaves forming a Riemann surface. 

I remark, in ending this chapter, that the above results relative to the 
equation 


$$\Delta u = k e^{u}$$


are not without interest for the theory of Fuchsian functions. I shall 
come back to this special application of the general theory that I have 
tried to develop in this memoir. 

In a final chapter I apply to ordinary differential equations the
approximation methods that I have used. It is particularly interesting to
consider a system of differential equations of the form

$$\frac{d^2 y_1}{dx^2} = f_1(x, y_1, y_2, \ldots, y_m), \quad \ldots, \quad \frac{d^2 y_m}{dx^2} = f_m(x, y_1, y_2, \ldots, y_m),$$

and to obtain a system of solutions for these equations continuous
between x = a and x = b and taking given values at these extremes.

The same considerations apply to a system of partial differential
equations of the form

$$\frac{\partial^2 u_i}{\partial x^2} + \frac{\partial^2 u_i}{\partial y^2} = f_i(x, y, u_1, u_2, \ldots, u_m), \qquad i = 1, 2, \ldots, m.$$

By imposing certain very general hypotheses on the $f_i$ one can
determine a system of solutions $u_1, u_2, \ldots, u_m$ taking given values on a
contour. [Extract ends.]


31.8 Picard and Hyperbolic Partial 
Differential Equations (1890) 


Picard showed that these equations can be solved with initial 
conditions on suitable arcs. 


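Among the equations of the type to be considered, it will be enough
to consider the equation

$$\frac{\partial^2 z}{\partial x\,\partial y} = a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz$$

because once this equation has been dealt with it will be enough to
repeat the same arguments as above to deal with the more general
equation

$$\frac{\partial^2 z}{\partial x\,\partial y} = F\left(z, \frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}, x, y\right).$$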


There can be no question here of finding the solution in terms of its 
values along a closed contour; we have to study the method of 
successive approximations with a view to determining the general 
integral. 

Let us consider in the (x, y)-plane the arc of an arbitrary curve C for
which we suppose only that either of the coordinates is a function of
the other, and always increases in the same sense. We want to obtain a
solution of this equation for which the partial derivatives $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$
take a given succession of values on C and which itself takes a specified
value at A on C.

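One first tackles the problem for the equation

$$\frac{\partial^2 z_1}{\partial x\,\partial y} = 0.$$

Let $z_1$ be a solution; one must then consider the equation

$$\frac{\partial^2 z_2}{\partial x\,\partial y} = a\frac{\partial z_1}{\partial x} + b\frac{\partial z_1}{\partial y} + c z_1.$$

One looks for a solution for which $\frac{\partial z_2}{\partial x}$ and $\frac{\partial z_2}{\partial y}$ vanish on C and $z_2$ itself
vanishes at A.
One then considers the equation in $z_3$

$$\frac{\partial^2 z_3}{\partial x\,\partial y} = a\frac{\partial z_2}{\partial x} + b\frac{\partial z_2}{\partial y} + c z_2,$$

which one solves under the same conditions, and continues in this way
indefinitely.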
It is now necessary to study the series 


$$z_1 + z_2 + \cdots + z_n + \cdots \qquad (31.37)$$


and to see if it gives a solution of the problem stated. [...] 
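Let us remark right away that the solution of the equation

$$\frac{\partial^2 z_1}{\partial x\,\partial y} = 0$$

is immediate if on C one gives $\frac{\partial z_1}{\partial x}$ as a function of x and $\frac{\partial z_1}{\partial y}$ as a function
of y. Let $\varphi(x)$ and $\psi(y)$ be these two functions. One will evidently have

$$z_1 = z_0 + \int_{x_0}^{x} \varphi(x)\,dx + \int_{y_0}^{y} \psi(y)\,dy,$$

where A has coordinates $(x_0, y_0)$.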


On the other hand, let the equation be

$$\frac{\partial^2 z}{\partial x\,\partial y} = F(x, y),$$

where F is a continuous function of x and y. The solution of this
equation that vanishes at A and for which $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ vanish on C can be
represented in the following way: let P be a point with coordinates
(x, y), and draw through this point parallels to the x and y axes meeting
the curve C at points M and N: the required solution is given by the
double integral

$$\iint F(\xi, \eta)\,d\xi\,d\eta$$

taken over the curvilinear triangle PMN.

One assumes, as I have already said, that from A to B on the arc C
either of the coordinates is a continuous function of the other and
always varies in the same sense (increasing in the case of the figure).


Fig. 31.2 The curve C and the point P


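This done, suppose that the point P lies in the rectangle ABA’B’
[see Fig. 31.2], and let AB’ = $\alpha$ and BB’ = $\beta$; we are going to look for
upper bounds on the different terms of the series (31.37).
Let us denote by M the maximum value of $a\frac{\partial z_1}{\partial x} + b\frac{\partial z_1}{\partial y} + c z_1$ in the
rectangle ABA’B’. One then has

$$\left|\frac{\partial z_2}{\partial x}\right| < M\alpha, \qquad \left|\frac{\partial z_2}{\partial y}\right| < M\beta, \qquad |z_2| < M\alpha\beta.$$

If moreover the maximum absolute values of a, b, and c in the rectangle
are A, B, and C, respectively, then in the rectangle

$$\left|a\frac{\partial z_2}{\partial x} + b\frac{\partial z_2}{\partial y} + c z_2\right| < M(A\alpha + B\beta + C\alpha\beta).$$

Consequently,

$$|z_3| < M(A\alpha + B\beta + C\alpha\beta)\alpha\beta, \qquad \left|\frac{\partial z_3}{\partial y}\right| < M(A\alpha + B\beta + C\alpha\beta)\beta.$$

Continuing thus, one arrives in general at

$$|z_n| < M(A\alpha + B\beta + C\alpha\beta)^{n-2}\alpha\beta.$$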


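It follows that the terms of the series (31.37) can be compared with a
geometric progression; if, therefore,

$$A\alpha + B\beta + C\alpha\beta < 1 \qquad (31.38)$$

then the series (31.37) and the two series

$$\frac{\partial z_1}{\partial x} + \frac{\partial z_2}{\partial x} + \cdots + \frac{\partial z_n}{\partial x} + \cdots, \qquad \frac{\partial z_1}{\partial y} + \frac{\partial z_2}{\partial y} + \cdots + \frac{\partial z_n}{\partial y} + \cdots$$

will converge. As for condition (31.38), it will evidently hold if the point
B is sufficiently close to the point A. The series converges inside the
rectangle ABA’B’. The function z, the limit of the series
$z_1 + z_2 + \cdots + z_n + \cdots$, will obviously have first partial derivatives and
the second derivative $\frac{\partial^2 z}{\partial x\,\partial y}$; furthermore it satisfies the equation

$$\frac{\partial^2 z}{\partial x\,\partial y} = a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz.$$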


Thus, under the above hypotheses, we have for the given partial
differential equation a solution z that takes a given value at the point A
on the curve, and for which the partial derivatives $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ take,
respectively, on C a prescribed continuous succession of values. These
functions $\varphi(x)$ and $\psi(y)$ in our analysis are subject only to the single
condition of being continuous. Let us remark that z, $\frac{\partial z}{\partial x}$, and $\frac{\partial z}{\partial y}$ are
continuous functions of x and y even when one crosses the arc C; here
there is an interesting point in the theory of partial differential
equations that it is good to insist upon.

A solution z of a linear second-order partial differential equation is,
one says in a general way, determined when one prescribes the values
of z and $\frac{\partial z}{\partial x}$ on a curve C, or, which comes to the same thing, the values
of $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ on this curve and the value of z at a particular point of C.
But this general conception is only valuable for a curve C traced in a
region of the plane where the characteristics are real, that is to say, only
in this case, when one is certain to have a solution satisfying the given
conditions that is continuous along with its first-order partial
derivatives when one crosses the arc C; our preceding analysis shows
very neatly that z, $\frac{\partial z}{\partial x}$, and $\frac{\partial z}{\partial y}$ are continuous in the passage over C.


It is quite otherwise when the characteristics are imaginary. To see
this, it suffices to take the simple example of the equation

$$\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = 0.$$

In general one cannot have a solution of this equation that is
continuous in the rectangle ABA’B’ along with its first-order partial
derivatives, and for which $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ take on the arc AB the succession
of values denoted above by $\varphi(x)$ and $\psi(y)$, these functions being
subject to no other condition than being continuous. In the contrary
case one could, in effect, form an analytic function $\frac{\partial z}{\partial x} - i\frac{\partial z}{\partial y}$ that will be
holomorphic in the rectangle under consideration, the real part of this
function being arbitrary on the curve AB, which is impossible because a
holomorphic function determined on an arc of a curve however small
can only be extended in a unique way.

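Thus the proof that we have given of the existence of a solution of
the equation

$$\frac{\partial^2 z}{\partial x\,\partial y} = a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz$$

and its development in series allows us to raise a question that we must
necessarily put on one side, when one supposes that a, b, c are analytic
functions and that the conditions on the bounds are expressed by
means of analytic functions.
The linear equation

$$\frac{\partial^2 z}{\partial x\,\partial y} = a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz$$
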


has been the object of a remarkable chapter in Darboux’s Leçons sur la
théorie des surfaces (Vol. II, Chap. IV). Following an idea of Riemann’s,
Darboux reduced the solution of this equation to the study of a
particular solution z; this solution z is determined by the condition that
it reduces for $y = y_0$ to a given function $\varphi(x)$ and for $x = x_0$ to a given
function $\psi(y)$. Darboux established the
existence of such a solution in supposing that a, b, c are analytic
functions of x and y, and he uses as an intermediary the celebrated
equation considered by Euler and Poisson. Staying with our point of
view of successive approximations, the proof of the existence of such a
solution z is quite easy, without making any other assumption about
a, b, c, $\varphi$, and $\psi$ other than that of continuity. [Evidently one has
$\varphi(x_0) = \psi(y_0)$, and one assumes that $\varphi$ and $\psi$ have first derivatives.]


[Here the extract ends. ] 
Picard then sketched the slight modifications of his earlier proof 
needed to adapt it to the new situation. 
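
The convergence argument in the extract above rests on comparing the series (31.37) with a geometric progression whose ratio is the left-hand side of condition (31.38). The following is a minimal sketch of that comparison in Python, not Picard's own computation: the function name and the sample numbers are illustrative assumptions, M is the bound on the first right-hand side, A, B, C bound |a|, |b|, |c|, and alpha, beta are the sides of the rectangle.

```python
def picard_series_bound(M, A, B, C, alpha, beta, n_terms=10):
    """Bounds suggested by the extract for the terms z_2, z_3, ... of (31.37).

    rho is the ratio appearing in condition (31.38); the k-th entry of
    `bounds` is M * alpha * beta * rho**k, the bound on |z_(k+2)|.
    `total` bounds the sum of all these terms when rho < 1.
    """
    rho = A * alpha + B * beta + C * alpha * beta
    bounds = [M * alpha * beta * rho ** k for k in range(n_terms)]
    total = M * alpha * beta / (1 - rho) if rho < 1 else float("inf")
    return rho, bounds, total


if __name__ == "__main__":
    # Hypothetical values: coefficients bounded by 1, a rectangle of side 1/4.
    rho, bounds, total = picard_series_bound(1.0, 1.0, 1.0, 1.0, 0.25, 0.25)
    print("ratio:", rho)
    print("first bounds:", bounds[:3])
    print("bound on the sum:", total)
```

Shrinking alpha and beta, that is, moving B closer to A, makes the ratio as small as one likes, which is the sense in which condition (31.38) "evidently" holds for B sufficiently close to A.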


Footnotes 


1 See the comment at the end of the translation. 


2 Cauchy used the same subscript notation for the initial conditions that he had used for the 
new variable, but the confusion this causes is slight. 


3 Cauchy mistakenly wrote B,C,... for doy = 0. 


4 I have used the third edition, 1882, which Hattendorff said is a careful revision of the first
edition. 


5 This follows from the Schwarz reflection principle that Schwarz had introduced earlier in his 
paper. 


6 For a complete discussion, see Nehari ([204], 189-198).


7 See also (Courant-Hilbert Vol. 2, Ch. 5, Sect. 5). 


8 If we regard F as an operator, then we regard G as the adjoint operator.
Appendices A to G: Historical support on Newton’s Principia
Mathematica, mathematical support on the method of characteristics,
first-order non-linear equations, Green’s theorem, complex analysis,
Möbius transformations, and the methods of Lipschitz and Picard.
Appendix H: Assessment questions and how to tackle them. 


Appendix A 


Newton's Principia Mathematica 


Newton’s theory of celestial mechanics, set out in his Principia 
Mathematica (Fig. A.1) [206], is based on his analysis of the concept of 
motion and its causes, a thorough-going mathematical analysis of the 
motion of bodies under the action of forces, and a meticulous study of 
the observed motion of the Moon, the planets, and their satellites. This 
led him to proclaim his highly novel theory of gravity, and to refute the
earlier and widely accepted ideas of Descartes. A notable success on
the way was his integration of Kepler’s laws, and his study of the
motion of a body under a central force. He came up with a remarkably
accurate description of the motion of the planets and their satellites
based on his inverse-square law of gravity, but it nonetheless failed to
account for the motion of the Moon, and this was a cause of great
controversy in the generation after his death.


A.1 Newton's Laws of Motion in His Principia 

Newton’s Principia was published in late 1687. It is a book of 547 pages, 
written in scholarly Latin, and after some introductory remarks and a 
few definitions it opens with these three laws of motion.¹


Fig. A.1 Title page of Newton’s Principia 


Law 1 Every body perseveres in its state of being at rest or of 
moving uniformly straight forward except insofar as it is 
compelled to change its state by forces impressed. Projectiles 
persevere in their motions, except insofar as they are retarded 
by the resistance of the air and are impelled downward by the 
force of gravity. A spinning hoop, which has parts that by their 
cohesion continually draw one another back from rectilinear 
motions, does not cease to rotate, except insofar as it is retarded 
by the air. And larger bodies - planets and comets — preserve for 
a longer time both their progressive and their circular motions, 
which take place in spaces having less resistance. 

Law 2 A change in motion is proportional to the motive force 
impressed and takes place along the straight line in which that 
force is impressed. If some force generates any motion, twice the 
force will generate twice the motion, and three times the force 
will generate three times the motion, whether the force is 
impressed all at once or successively by degrees. And if the body 


was previously moving, the new motion (since motion is always 
in the same direction as the generative force) is added to the 
original motion if that motion was in the same direction or is 
subtracted from the original motion if it was in the opposite 
direction or, if it was in an oblique direction, is combined 
obliquely and compounded with it according to the directions of 
both motions. 

Law 3 To any action there is always an opposite and equal 
reaction; in other words, the actions of two bodies upon each 
other are always equal and always opposite in direction. 
Whatever presses or draws something else is pressed or drawn 
just as much by it. If anyone presses a stone with a finger, the 
finger is also pressed by the stone. If a horse draws a stone tied 
to a rope, the horse will (so to speak) also be drawn back equally 
toward the stone, for the rope, stretched out at both ends, will 
urge the horse toward the stone and the stone toward the horse 
by one and the same endeavor to go slack and will impede the 
forward motion of the one as much as it promotes the forward 
motion of the other. If some body impinging upon another body 
changes the motion of that body in any way by its own force, 
then, by the force of the other body (because of the equality of 
their mutual pressure), it also will in turn undergo the same 
change in its own motion in the opposite direction. By means of 
these actions, equal changes occur in the motions, not in the 
velocities - that is, of course, if the bodies are not impeded by 
anything else. For the changes in velocities that likewise occur in 
opposite directions are inversely proportional to the bodies 
because the motions are changed equally. This law is valid also 
for attractions, as will be proved in the next scholium. 


Newton’s laws of motion are stated as axioms, and accordingly neither 
derived from other statements nor based on experiments. Newton gave 
an explanation and elucidation of each law, but not a justification. And, 
as befits axioms, the laws are the basis for subsequent deductions 
concerning the behaviour of moving bodies and bodies acted upon by 
forces. Newton’s laws state presumed properties of matter in motion; 
they are not specifically mathematical, being neither geometric nor 


algebraic. They are not stated in the form of equations involving 
symbols and their manipulation. 

In a book devoted to the study of motion in general and planetary 
motions in particular Newton had to decide what were the crucial 
astronomical ideas that he needed. He chose to rely on all three of 
Kepler’s laws, and Newton’s vast generalisation of Kepler’s somewhat 
controversial second (equi-area) law, prominently placed near the front 
of the book, was to play a vital role in the theory he presented (see 
Sect. A.1.1). 

The Newton scholar I.B. Cohen has observed that²:


It was an unusual and a very daring step to erect an 
astronomical system encompassing Kepler’s three laws, as 
Newton did. Following the imaginative leap forward that 
Newton made, in showing the physical meaning and conditions 
of mathematical generality or applicability of each of Kepler’s 
laws, this whole set of three laws gained a real status in exact 
science. 


A.1.1 The Content of the Principia 


Before Book I begins, the Principia has an introduction in which 
Newton spelled out the mathematically precise concepts used in his 
three laws of motion. He defined the quantity of matter and the 
quantity of force, and he discussed forces of various kinds. He then gave 
a complicated distinction between relative and absolute motion and 
relative and absolute time: in many ways Newton treated all motion as 
relative, but he also regarded the centre of the universe (which he 
regarded as the centre of the solar system) as being absolutely at rest. 
Only then come the three axioms or laws of motion we looked at above, 
and the first of their elementary consequences. 

This book then gives a long, careful, cumulative discussion of “the 
method of first and last ratios of quantities”: a geometrical study of 
curves and their tangents in the spirit in which Newton conducted his 
investigations of the calculus. 

Then Newton turned to a study of the motion of a point under a 
centripetal force. He showed that the line joining a fixed point to a 
moving one sweeps out equal areas in equal times if and only if the 


force on the moving point is directed towards the fixed point. 
Remarkably, the size of the force can depend in any way on the length of 
the radial line; the orbit can be any shape determined by the law, not 
just a circle or an ellipse. 
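
The claim that the equal-area law holds for any force directed towards a fixed point, whatever its dependence on distance, can be checked numerically. The sketch below is an illustration only, not Newton's geometrical argument: it integrates a planar orbit with a velocity-Verlet (leapfrog) scheme for a force law supplied by the caller and records the areal velocity, half of x·v_y − y·v_x, at each step. The function name, the starting values, and the sample force laws are all assumptions made for the example.

```python
import math


def areal_velocities(force_law, r0=(1.0, 0.0), v0=(0.0, 0.7),
                     dt=1e-3, steps=20000):
    """Integrate motion under a central attraction and return areal velocities.

    force_law(r) is the magnitude of the attraction (per unit mass) towards
    the origin at distance r. Each returned value is 0.5 * (x*vy - y*vx),
    the rate at which the radius sweeps out area.
    """
    x, y = r0
    vx, vy = v0

    def accel(px, py):
        r = math.hypot(px, py)
        f = force_law(r)
        return -f * px / r, -f * py / r

    ax, ay = accel(x, y)
    rates = []
    for _ in range(steps):
        # Half kick, full drift, half kick (velocity Verlet).
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        rates.append(0.5 * (x * vy - y * vx))
    return rates


if __name__ == "__main__":
    laws = {
        "inverse square": lambda r: 1.0 / r ** 2,
        "proportional to distance": lambda r: r,
        "square root of distance": lambda r: math.sqrt(r),
    }
    for name, law in laws.items():
        rates = areal_velocities(law)
        print(name, "spread in areal velocity:", max(rates) - min(rates))
```

The orbits traced under the three laws have quite different shapes, yet the areal velocity stays constant to within rounding error in each case, because every step changes the velocity only along the radius.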

Among the special cases that are then worked out is this one: if the 
moving body traverses a conic section under a centripetal force 
directed towards one focus, then the magnitude of the force is inversely 
proportional to the square of the distance. Newton knew well, as his 
acceptance of Kepler’s laws indicates, that this is the relevant case in 
astronomy. In the first edition of the Principia he also stated the 
converse (it is Cor. 1 to Prop. 13): under an inverse-square law, bodies 
move in curves which are conic sections having the centre of force as 
one focus. Controversies surrounding this statement led Newton to 
enrich it with a skeleton proof in the second edition (1713). 

To find where in its orbit a planet can be found at any particular 
time Newton used Kepler’s equi-area law. He also showed that if 
planets traverse ellipses under the action of a force that obeys an 
inverse-square law, then they necessarily obey Kepler’s third law (the 
3/2 power law, which says that if r is the radius of the orbit of a planet,
and t is the time to complete one orbit, then $t^2 \propto r^3$).
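
Kepler's third law is easy to check against planetary data. The short sketch below uses rounded modern values for the mean orbital radius (in astronomical units) and the period (in years); these figures are supplied here as assumptions for illustration and are not taken from the Principia. With these units the ratio of the square of the period to the cube of the radius should be close to 1 for every planet.

```python
# Rounded modern values (assumed for illustration):
# name -> (mean orbital radius r in astronomical units, period t in years)
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def kepler_ratios(planets=PLANETS):
    """Return t**2 / r**3 for each planet; the third law says these agree."""
    return {name: t ** 2 / r ** 3 for name, (r, t) in planets.items()}


if __name__ == "__main__":
    for name, ratio in kepler_ratios().items():
        print(f"{name:8s} t^2 / r^3 = {ratio:.4f}")
```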


Newton then investigated the attraction between solid bodies 
under an inverse-square law. He established that a spherical shell 
exerts no force on a point inside it and attracts a point outside it in the 
same way as a point mass concentrated at the centre. This remarkable 
result, which surprised him as much as his contemporaries, enabled 
him to reduce the study of large spherical objects like planets to the 
study of points and centripetal forces, which he had already described. 
In Newton’s theory of gravity, large solid spheres may be replaced by 
points (of the same mass)—a considerable simplification in the theory. 
For much work in astronomy, the assumption that planets and the sun 
are spherical in shape is entirely reasonable. 
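
Newton's result about spherical shells can be checked by direct numerical integration. The sketch below is a modern illustration under stated assumptions, not Newton's geometrical proof: it slices a uniform shell of unit mass into thin rings, adds up the component of each ring's inverse-square attraction along the line to the centre (with the gravitational constant set to 1), and uses a simple midpoint rule. The function name and the sample distances are hypothetical.

```python
import math


def shell_attraction(d, R=1.0, n=20000):
    """Attraction towards the centre on a unit mass at distance d from the
    centre of a uniform spherical shell of radius R and unit total mass.

    The shell is cut into rings at polar angle theta, each carrying the
    mass fraction sin(theta) * dtheta / 2; only the component of the
    attraction along the axis through the point survives by symmetry.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h  # midpoint rule
        s_squared = d * d + R * R - 2.0 * d * R * math.cos(theta)
        axial = d - R * math.cos(theta)
        total += 0.5 * math.sin(theta) * axial / s_squared ** 1.5 * h
    return total


if __name__ == "__main__":
    print("outside, d = 2:", shell_attraction(2.0), "point mass gives", 1 / 2.0 ** 2)
    print("inside, d = 0.5:", shell_attraction(0.5))
```

Outside the shell the integral agrees with the attraction of a point mass at the centre, and inside it the contributions of the near and far parts of the shell cancel.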

Newton discussed many topics in Book II, subtitled “The Motion of
Bodies (in resisting media)”, but we need to note only that at the end of
the book Newton demolished Descartes’s theory of motion in vortices
and concluded: “Hence it is manifest that the planets are not carried
round in corporeal vortices”.³


In Book III, “The System of the World (in mathematical treatment)” 
Newton demonstrated that the theory of an inverse-square law for 
gravity acting as a force between bodies can explain the motion of the 
planets and of their satellites, the motion of comets, and enable the 
shape of the Earth to be determined. But the Moon gave him trouble,
and he could deduce only that its motion obeys the equi-area law.


A.2 The Motion of the Moon 


The motion of the Moon is far from simple, and it has been intensively 
studied not only because it is our nearest neighbour in space but 
because, if it could be understood accurately, it would provide an 
excellent clock and so be an aid to navigation. 

The problem is that the Moon is part of a system of three bodies: the 
Earth, the Moon, and the Sun, and while Newton could deal well with 
two bodies acting on each other by gravity, the three-body problem, as 
it became known, is (strictly speaking) unsolved to this day. No one can 
yet answer the question: given three arbitrary bodies acting on each 
other by gravity and released initially with such-and-such velocities, 
what will be their orbits for all future times? Will the Moon always orbit 
the Earth, move away, or eventually collide with it? We do not know. 

But if the mathematical problem is too difficult to solve exactly, 
exhaustive computer simulations enable scientists to navigate satellites 
around the solar system with astonishing accuracy and to make a 
variety of predictions about the long-term fate of the solar system (and, 
equally interesting, about its origins). These predictions say that the 
Moon will still be orbiting the Earth 50 million years from now. 

In the seventeenth and eighteenth centuries conclusions could only
be reached if some simplifying assumptions were made. Newton
assumed, for example, that the only effect of the Sun is to perturb
slightly an otherwise elliptical orbit of the Moon around the Earth, and
then tried to calculate that perturbation exactly (as measured by the
motion of the apse of the ellipse). His success was less than complete,
and because the Moon is an easy object to observe the mismatch
between his, and indeed any, prediction and reality was apparent.

Newton’s calculations in Principia I, 45 showed that the elliptical orbit
rotates slowly around the centre of gravity of the Earth and the Moon,
returning to its original place every 18 years.⁴ But, as Newton
conceded in later editions of the Principia in a single crisp sentence:
“The [advance of the] apsis of the Moon is about twice as swift”. He
was unable to come up with a significantly more convincing and accurate
theory, although unpublished papers show that he was able to get a
better approximation to the Moon’s motion, and it says a lot about his
high standards that he was displeased that his approximate theory was
out by a factor of 2.

But this point is not merely technical; its implications were 
profound. This failure of Newton’s, it was thought, might be the loose 
thread that would unravel his theory. And his theory of universal 
gravitation was unpopular in Cartesian circles, and initially too difficult 
to understand in all circles. As a result, Newtonian gravity (more 
precisely, the inverse-square law) now came to stand or fall by its 
ability to describe sufficiently accurately the motion of the Moon. 

Significant progress on the question had to wait until 1747. In that
year, Clairaut wrote to Euler that it was “a proven fact that Newtonian
gravitation is inadequate to account for the [lunar] phenomena”.⁶ He
therefore proposed to add a small inverse fourth-power term to the
inverse-square law (making a law of the form $f(r) = ar^{-2} + br^{-4}$).
D’Alembert had come independently to the same opinion that Newton’s
theory was incorrect, as did Euler, on the basis of his study of a
different three-body system (the Sun, Jupiter, and Saturn), although
each man had a different remedy in mind.⁷
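
To see why an extra inverse fourth-power term could be expected to move the apse, one can use the classical formula for orbits that are very nearly circular, the setting Newton treats in Principia I, 45: for a central attraction f(r), the angle between successive apsides is π / √(3 + r f′(r) / f(r)). This formula is brought in here as an assumption for illustration; it is not quoted in the text, and the coefficients below are hypothetical rather than Clairaut's. For a pure inverse-square law the angle is exactly π, so the apse stays fixed; a small positive inverse fourth-power term makes the angle slightly larger, so the apse advances.

```python
import math


def apsidal_angle(force, r, h=1e-6):
    """Angle between successive apsides of a nearly circular orbit of radius r.

    Uses the near-circular formula pi / sqrt(3 + r * f'(r) / f(r)), with
    f'(r) estimated by a central difference of step h.
    """
    derivative = (force(r + h) - force(r - h)) / (2.0 * h)
    return math.pi / math.sqrt(3.0 + r * derivative / force(r))


def two_term_law(a, b):
    """Attraction of the form a / r**2 + b / r**4 (hypothetical coefficients)."""
    return lambda r: a / r ** 2 + b / r ** 4


if __name__ == "__main__":
    pure = apsidal_angle(two_term_law(1.0, 0.0), 1.0)
    modified = apsidal_angle(two_term_law(1.0, 0.01), 1.0)
    print("inverse-square only:", math.degrees(pure), "degrees")
    print("with small r^-4 term:", math.degrees(modified), "degrees")
```

For this two-term law the formula reduces to π √((a r² + b) / (a r² − b)), which shows that the added term matters less and less as r grows.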

Euler had already submitted his essay on the motion of Saturn to 
the Académie des Sciences in Paris for consideration in their prize 
competition. In it, he expressed his doubts about the inverse-square 
law, particularly Newton’s failure with the motion of the apse of the 
Moon, and he hinted that he wished to re-introduce vortices (thus he 
framed what Newton would have called “an hypothesis”, an ad hoc 
mechanism). Clairaut, one of the judges, read the essay in September
1747, recognised Euler’s handwriting, and wrote to Euler on 11
September to say that he was delighted that Euler had thought about
Newtonian attraction. Clairaut went on:


It is true that on adding some other term one feels that the 
theory will better accord with the phenomena. But it seems to 


me that this term must be such that at the distances of Mercury, 
Venus, the Earth and Mars it must be almost insensible, in view 
of the extreme smallness of the motion of the apsides. And if, as 
it seems initially from your work, the law of squares is palpably 
in error at the distance of Saturn and Jupiter it would still be 
necessary to add terms which were significant only at that 
distance. I confess that the whole of gravitation seems to me to 
be only a speculative hypothesis. 


He then remarked that 


It seems to me, and I am not a candidate for the prize, much 
more important to know if Newtonian attraction holds or not 
than to treat simply of Saturn. And in seeing if the square law of 
attraction must suffer some correction which can only be for 
small distances it seems to me to be necessary to begin by 
finishing the theory of the moon. 


However, Clairaut soon withdrew the suggestion that the modifying 
term should be an inverse-fourth power, because it predicted that 
objects near the surface of the Earth should be heavier than they are. 
He also rejected Euler’s vortices, which he thought Euler himself had 
shown to be no help at all.⁸

In Clairaut’s view, part of the problem was that Newton’s Principia
was difficult to understand.⁹ He praised it, which was still a
controversial thing to do in France, by saying


The famous book The Mathematical Principles of Natural 
Philosophy has been the occasion of a great revolution in Physics. 
The method which Mr Newton, its illustrious author, has 
followed to derive facts from their causes, has shed the light of 
mathematics on a science which up till then had been in the 
shadows of conjectures and hypotheses. 


and then he turned to say what had to be done next. The problem was
not that Newton concealed his fluxional calculus; that was easy to
supply. Rather,


is it not right to reproach him for another wrong which without 
doubt has struck all those who have studied his book with a true 
desire to understand it? Namely, that in most of the difficult 
places he employed too few words to explain his principles [...]. 


That said, Clairaut reflected, so much else was right—“Kepler’s laws 
[...], the movement of the nodes of the moon [...], the tides, [...] and 
finally several other questions equally favourable to attraction [that] it 
appeared to me as difficult to reject as to accept”. 

Clairaut began to work intensively on the law of gravitational 
attraction, and on 17 May 1749 he announced his surprising conclusion 
that, by taking a new point of view, he had found that the problem 
disappeared, and the inverse-square law could give the correct 
prediction for the apse line of the Moon. 

Euler was not immediately convinced, and in 1749 he persuaded 
the St Petersburg Academy to have a prize competition, and suggested 
several propitious topics, all of an astronomical nature. They chose this 
one: “To demonstrate whether all the inequalities observed in lunar
motion are in accordance with Newtonian theory—and if they are not, 
to demonstrate the true theory behind all these inequalities, such that 
the exact position of the Moon at any time can be computed by means 
of it”, and Euler became one of the judges, indeed the decisive one.¹⁰

Clairaut hesitated over whether to enter the competition. He 
published his paper, which d’Alembert criticised, and only submitted 
his entry in December 1750. The Academy had by then decided to 
extend the competition to June 1, 1751, but when they sent Euler 

Clairaut’s entry he replied that it “is superb, and it is hardly likely that 
anything better will be received prior to June 1”. He repeated his 
endorsement, and admitted that he had changed his own opinion, in 
the official statement he wrote on 5 June, and the result was announced 
on 6 September 1751.¹¹

In his paper of 1749, and again in 1753, Clairaut argued that the 
error lay in the poor way in which exact, unsolvable equations for the 
motion of the Moon had been reduced to inexact, approximate, but 
solvable equations. 

Clairaut formulated the problem of the motion of the Moon in terms
of differential equations, and after integrating twice found this
expression for the solution, in which $\Omega$ is an unknown function of r, the
radial distance of the Moon, and of the perturbative force of the Sun:

$$\frac{f^2}{Mr} = 1 - g\sin v - q\cos v + \sin v\int \Omega\cos v\,dv - \cos v\int \Omega\sin v\,dv. \qquad (A.1)$$

Here, f, g, and q are constants of integration, M is the sum of the masses
of the Earth and the Moon, v is an astronomical quantity called the true
anomaly (which may be taken to represent the velocity of the Moon).

To find $\Omega$, Clairaut employed a process of successive
approximations. The apse of the Moon was understood to move rather
as if the Moon precesses on an ellipse. So Clairaut, following Newton,
first wrote the equation of an ellipse in polar coordinates

$$\frac{k}{r} = 1 - e\cos mv. \qquad (A.2)$$

Here, k, e, and m are constants that are either to be determined from
the constants f, g, and q or otherwise found from observation. In
particular, e was already known empirically to be about 0.05. This
means that as $\cos mv$ varies between at most −1 and +1, r varies
between $k/(1+e)$ and $k/(1-e)$.


Clairaut substituted this approximation into his original equation, 
and obtained this better approximation to r: 

$$\frac{k}{r} = 1 - e\cos mv + \beta\cos\frac{2v}{n} + \gamma\cos\left(\frac{2}{n} - m\right)v + \delta\cos\left(\frac{2}{n} + m\right)v. \tag{A.3}$$

Here n is another quantity determined by observations (and therefore 
known) and β, γ, and δ are constants determined from the other 
constants so far; the new terms describe the way the ellipse slowly 
changes. 

Clairaut now explained his original mistake, which had led him to 
deny the inverse-square law. He evaluated β, γ, and δ and found that 
to nine decimal places 

β = 0.007090988, γ = −0.00949705, and δ = 0.00018361. 


These numbers are much smaller than e, and he felt that they were 
already too small to allow his method to double the value of m, so he 
did not seek a better, second approximation. Accordingly, he had been 
inclined to believe that the error must lie in the inverse-square law, 
which consequently needed amending. But in the spring of 1749 he 
calculated the next approximation and found that his hunch had been 
wrong. It turned out that the contributions coming from the γ term 
were not only quite large, they were proportional to the transverse 
perturbing force, whereas the initial contribution to m related only to 
the radial perturbing force. It was only by going to the second 
approximation that Clairaut could pick up the effect that was making 
the Moon’s ellipse precess. Now, on calculating these numbers, Clairaut 
found that the monthly apsidal motion was 3°2’6”, which was just 2’ 
less than the empirical value that he accepted. 

It must be said that even Euler found Clairaut’s method difficult to 
follow in detail.'* But in the end Newtonianism became accepted very 
much as Newton had presented it, in that 


1. it was a highly mathematical theory of the solar system; 

2. the predictions that it made rested on a highly theoretical analysis; 

3. if its conclusions were accepted then its theoretical 
presuppositions seemed inevitable, provided the mysterious force 
of gravity was accepted as really existing. 


This vindication of Newton’s theoretical approach gave 
mathematicians the confidence to deal for the first time with many 
more of the most interesting aspects of the physical world. 


Appendix B 


Characteristics 


B.1 First-Order Linear Partial Differential Equations 


The simplest partial differential equation in two variables x and y that 
one could hope to solve is $u_x = 0$, which is essentially an ordinary 
differential equation, and its solution is $u(x, y) = f(y)$, where f is any 
function of y. Notice, crucially, that any solution is constant along the 
curves y = const., and also that it can vary arbitrarily, not necessarily 
even continuously, from curve to curve. 

A modern approach to linear partial differential equations in two 
variables aims to reduce them to this form, and so reduce the problem 
to one in ordinary differential equations.!° 

Consider the equation 

$$a u_x + b u_y = 0,$$

where a and b are constants, not both zero. One solution method 
notices that the equation says that the directional derivative of the 
function u vanishes in the direction (a, b) and so any solution u is 
constant along lines with equations of the form bx − ay = c, and so the 
solution to the partial differential equation is 

$$u(x, y) = f(bx - ay),$$

which is constant along the lines bx − ay = c. 

A second solution method changes variables to 

$$\xi = ax + by, \quad \eta = bx - ay,$$

observes that as a result 

$$u_x = a u_\xi + b u_\eta, \quad u_y = b u_\xi - a u_\eta,$$

and so the partial differential equation becomes 

$$(a^2 + b^2) u_\xi = 0.$$

This gives the solution 

$$u(\xi, \eta) = f(\eta) = f(bx - ay),$$

as before. 

Once again the solution is constant along a family of curves—called 
the characteristics of the partial differential equation—and is therefore 
determined by the values specified on any curves transversal to the 
characteristics. 
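
The claim that u = f(bx − ay) solves the constant-coefficient equation can be checked numerically. The following is only a sketch: the values a = 2, b = 3, the choice f = sin, and the step size h are illustrative assumptions, not taken from the text.

```python
import math

def u(x, y, a=2.0, b=3.0, f=math.sin):
    """Candidate solution u = f(b*x - a*y) of a*u_x + b*u_y = 0."""
    return f(b * x - a * y)

def residual(x, y, a=2.0, b=3.0, h=1e-5):
    """Central-difference estimate of a*u_x + b*u_y at the point (x, y)."""
    ux = (u(x + h, y, a, b) - u(x - h, y, a, b)) / (2 * h)
    uy = (u(x, y + h, a, b) - u(x, y - h, a, b)) / (2 * h)
    return a * ux + b * uy

# The residual should be close to zero at any point.
print(residual(0.3, -1.2))
# Moving a distance s in the direction (a, b) should not change u.
print(u(0.3, -1.2), u(0.3 + 2.0 * 0.7, -1.2 + 3.0 * 0.7))
```

The second printed pair illustrates that the solution is constant along the characteristic lines bx − ay = c.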

The method of characteristics applies to linear first-order partial 
differential equations with variable coefficients: 


$$a(x, y) u_x + b(x, y) u_y = 0.$$


The characteristics are now curves along which the directional 
derivative of u vanishes. 
Grigoryan gives this example: 

$$u_x + y u_y = 0.$$

We want the directional derivative in the direction (1, y) to vanish, that 
is, along curves for which 

$$\frac{dy}{dx} = \frac{y}{1},$$

and these are the curves with equations $y = Ce^x$, where $C = ye^{-x}$ is a 
parameter that varies from curve to curve. As before, the solution to the 
partial differential equation is constant along these curves, so it is given 
by 

$$u(x, y) = f(ye^{-x}).$$

Or we could argue that if u(x, y) is constant along a curve then along 
that curve we always have 

$$u_x dx + u_y dy = 0,$$

and so 

$$\frac{dy}{dx} = -\frac{u_x}{u_y} = y,$$

which leads to the same conclusion. 
As before, we can also solve the equation 

$$a(x, y) u_x + b(x, y) u_y = 0$$

by changing variables to 

$$\xi = \xi(x, y), \quad \eta = \eta(x, y).$$

We have 

$$u_x = u_\xi \xi_x + u_\eta \eta_x, \quad u_y = u_\xi \xi_y + u_\eta \eta_y,$$

and so the partial differential equation becomes 

$$(a\xi_x + b\xi_y) u_\xi + (a\eta_x + b\eta_y) u_\eta = 0.$$

If we set $a\eta_x + b\eta_y = 0$ then the partial differential equation becomes 
the ordinary differential equation $u_\xi = 0$ (provided $a\xi_x + b\xi_y \neq 0$), 
which we solve. The solutions are of the form $u = f(\eta)$, and once again 
they are constant along the curves η = const. 

So we now have to solve the equation $a\eta_x + b\eta_y = 0$, and 
(mimicking an earlier argument) we deduce from u = u(x, y) and 
$u_x dx + u_y dy = 0$ along a curve on which u is constant that 

$$\frac{dy}{dx} = -\frac{u_x}{u_y} = \frac{b}{a},$$

as before. 

Indeed, the solution to the ordinary differential equation 

$$\frac{dy}{dx} = \frac{b(x, y)}{a(x, y)}$$

is given by an equation of the form F(x, y) = c. This can be written 
locally, if we allow ourselves to share the confidence of the eighteenth 
century authors, as y = f(x) + c. 


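Define $\eta = y - f(x)$ and $\xi = x$, so 

$$u_x = u_\xi + u_\eta \left(-\frac{df}{dx}\right), \quad u_y = u_\eta \eta_y = u_\eta,$$

and in the new variables, because df/dx = b/a, the partial differential 
equation becomes 

$$a\left(u_\xi - \frac{b}{a} u_\eta\right) + b u_\eta = 0,$$

or 

$$a u_\xi = 0,$$

the solutions of which are 

$$u(\xi, \eta) = g(\eta),$$

and these values can be prescribed arbitrarily along a curve 
transversal to the curves η = const. 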


To sum up the story so far, the first step in solving a linear first-order 
partial differential equation is to change variables, so almost 
automatically we use the formulae 

$$u_x = u_\xi \xi_x + u_\eta \eta_x, \quad u_y = u_\xi \xi_y + u_\eta \eta_y.$$

The second step is to change variables in such a way that one term in 
the partial differential equation vanishes. This requires solving an 
ordinary differential equation for the characteristic curves. The third 
step observes that the solutions are constant along the characteristics, 
and so (fourth and final step) these values are picked up from a given 
set of initial conditions that specify the values of a solution along a 
curve transversal to all the characteristics. 
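
The steps just summarised can be sketched numerically. This is an illustration under stated assumptions rather than a method from the text: the equation is written as $u_x + s(x, y) u_y = 0$ with s = b/a, the initial curve is taken to be the line x = 0, and the names `trace_back` and `solve` are hypothetical. Each characteristic dy/dx = s(x, y) is followed back to x = 0 with a fourth-order Runge-Kutta step, and the solution picks up the initial value found there.

```python
import math

def trace_back(x, y, slope, steps=200):
    """Follow the characteristic dy/dx = slope(x, y) from (x, y) back to x = 0 (RK4)."""
    h = -x / steps
    for _ in range(steps):
        k1 = slope(x, y)
        k2 = slope(x + h / 2, y + h * k1 / 2)
        k3 = slope(x + h / 2, y + h * k2 / 2)
        k4 = slope(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def solve(x, y, g, slope):
    """Value at (x, y) of the solution with u(0, y) = g(y): constant along characteristics."""
    return g(trace_back(x, y, slope))

# The example u_x + y*u_y = 0, whose characteristics are y = C*exp(x).
value = solve(1.0, 2.0, math.cos, lambda x, y: y)
print(value, math.cos(2.0 * math.exp(-1.0)))
```

For this example the characteristics are known in closed form, so the two printed numbers can be compared directly; for other coefficients only the numerical tracing would be available.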


B.2 Burgers’ Equation, a Non-linear Equation 
Burgers’ equation, 


$$u_t + u u_x = 0,$$


is a non-linear equation, and it displays interesting phenomena.'* The 
characteristics are straight lines of varying slopes, and the solution is 
constant along the characteristics. But now suppose, for example, that 
the slopes of all the lines through the x-axis for negative x are steep and 
positive, and that the slopes of all the lines through the x-axis for 
positive x are shallow and positive. Then none of the first kind of 
characteristic lines meet any of the second kind, and as t increases a 
gap opens up between the first kind and the second kind. This is called 
rarefaction. 

It is more interesting if, instead, the slopes of all the lines through 
the x-axis for negative x are shallow and positive, and the slopes of 
all the lines through the x-axis for positive x are steep and positive. 
Then all of the first kind of characteristic lines meet characteristic 
lines of the second kind, and as t increases it becomes impossible to say 
what happens when the characteristics cross (because the solution 
cannot have two different values). This is the phenomenon of a 
shockwave. 

We are given the quasi-linear equation 

$$u_t + a(u) u_x = 0$$

with initial values on the line t = 0. 

The characteristics in this case are the solutions of the ordinary 
differential equation 

$$\frac{dx}{dt} = a(u).$$

The theory of characteristics says that every solution u remains 
constant on each characteristic, and so the slope of each characteristic 
is constant and the characteristics are straight lines. 

There is a characteristic for each point x on the line t = 0 and 
its slope is determined by the value of u at that point. So if the values 
of a(u) at $x_1$ and $x_2$, with $x_1 < x_2$, are $a_1$ and $a_2$ and $a_1 > a_2$ 
then the characteristic through $x_1$ has a lesser slope (measured as 
dt/dx in the (x, t)-plane) than the characteristic through $x_2$ and will 
eventually cross it. This shows 
that the solution cannot be continued beyond that point. 
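
To make the crossing concrete: two straight characteristics $x = x_1 + a_1 t$ and $x = x_2 + a_2 t$ meet when $t = (x_2 - x_1)/(a_1 - a_2)$. The small sketch below computes that time; the function name `crossing_time` is hypothetical and the sample speeds are illustrative assumptions.

```python
def crossing_time(x1, a1, x2, a2):
    """Time t > 0 at which x = x1 + a1*t meets x = x2 + a2*t, or None if there is none."""
    if a1 == a2:
        return None  # parallel characteristics never meet
    t = (x2 - x1) / (a1 - a2)
    return t if t > 0 else None

# A faster characteristic starting behind a slower one catches it up.
print(crossing_time(0.0, 1.0, 1.0, 0.0))
# A slower characteristic starting behind a faster one never does.
print(crossing_time(0.0, 0.0, 1.0, 1.0))
```

Only when the characteristic on the left is the faster one does a positive crossing time exist, which is the situation described above.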
Courant and Hilbert (Vol. 2, Appendix 2 to Chap. II) also pointed out 
that the partial differential equation with initial values given by 
u(x, 0) = v(x) has a solution in the implicit form 

$$u = v(x - t a(u)),$$

and the implicit function theorem implies that u is a differentiable 
function of x and t as long as the u derivative of $u - v(x - t a(u))$ does not 
vanish, a condition that holds whenever $1 + t a'(u) v' \neq 0$. Whenever this 
condition is violated one can expect u to become singular. 
Another good discussion of Burgers’ equation 

$$u_t + u u_x = 0$$

is given in Evans ([100], 140-144). He takes initial data on the axis 
t = 0 that is given by a function u(x, 0) = g(x) that is defined as 

$$g(x) = \begin{cases} 1 & \text{if } x \le 0 \\ 1 - x & \text{if } 0 \le x \le 1 \\ 0 & \text{if } x \ge 1. \end{cases}$$

The characteristic through $x_0$ is $x(t) = g(x_0)t + x_0$, $t \ge 0$, and along it 
any smooth solution takes the constant value $g(x_0)$. It is 
instructive to plot some of these before proceeding. Accordingly, the 
solution function is 

$$u(x, t) = \begin{cases} 1 & \text{if } x \le t, \; 0 \le t < 1 \\ \dfrac{1 - x}{1 - t} & \text{if } t \le x \le 1, \; 0 \le t < 1 \\ 0 & \text{if } x \ge 1, \; 0 \le t < 1. \end{cases}$$

The method breaks down when t = 1 (note that u is apparently 
infinite when t = 1) and in this case the characteristics cross. The 
most visible case is the way the characteristics through points $(x_0, 0)$ 
with $x_0 \le 0$ meet the ones through points $(x_0, 0)$ with $x_0 \ge 1$. 
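
The following sketch encodes the initial data, the characteristics, and the solution formula for t < 1, so that one can check that the formula is constant along each characteristic. The function names are hypothetical and the sample points are illustrative assumptions.

```python
def g(x):
    """Initial data: 1 for x <= 0, 1 - x on [0, 1], 0 for x >= 1."""
    if x <= 0:
        return 1.0
    if x <= 1:
        return 1.0 - x
    return 0.0

def characteristic(x0, t):
    """Position at time t of the characteristic that starts at (x0, 0)."""
    return x0 + g(x0) * t

def u(x, t):
    """Solution formula, valid only before the characteristics cross (0 <= t < 1)."""
    if not 0 <= t < 1:
        raise ValueError("the formula is only valid for 0 <= t < 1")
    if x <= t:
        return 1.0
    if x <= 1:
        return (1.0 - x) / (1.0 - t)
    return 0.0

# Along each characteristic the solution keeps its initial value g(x0).
for x0 in (-1.0, 0.25, 0.5, 2.0):
    print(x0, g(x0), u(characteristic(x0, 0.5), 0.5))
```

Evaluating `characteristic(x0, 1.0)` for any x0 between 0 and 1 gives the point x = 1, which is one way to see why the formula cannot be continued past t = 1.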


Appendix C 


The First-Order Non-linear Partial Differential 
Equation 


For reference, here is a statement of the existence and uniqueness 
theorem for the first-order partial differential equation 

$$F(x, y, z, p, q) = 0$$

(from John ([153], 29)). Suppose that: 

1. The function F has continuous second derivatives; 

2. Along an initial curve $(x_0(s), y_0(s))$ initial values $z = z_0(s)$ are assigned, 
and $x_0, y_0, z_0$ have continuous second derivatives; 

3. There are two continuously differentiable functions $p_0(s)$ and $q_0(s)$ 
such that 

$$F(x_0(s), y_0(s), z_0(s), p_0(s), q_0(s)) = 0$$

and 

$$\frac{dz_0}{ds} = p_0\frac{dx_0}{ds} + q_0\frac{dy_0}{ds};$$

4. The transversality condition holds: 

$$\frac{dx_0}{ds}F_q(x_0, y_0, z_0, p_0, q_0) - \frac{dy_0}{ds}F_p(x_0, y_0, z_0, p_0, q_0) \neq 0.$$

Then in some neighbourhood of the initial curve there exists a unique 
solution that contains the initial strip, i.e. 

$$z(x_0(s), y_0(s)) = z_0(s), \quad z_x(x_0(s), y_0(s)) = p_0(s), \quad z_y(x_0(s), y_0(s)) = q_0(s).$$

We shall now see that the equation has a unique solution. 
Solutions of the general first-order partial differential equation 


$$F(x, y, z, p, q) = 0$$


with whatever conditions on F that seem necessary will form a two- 
parameter family of surfaces together with whatever envelopes one- 
parameter families may form and any envelopes the two-parameter 
family has (there may be none). 

We now make a standard move in the subject and attempt to 
accumulate so many necessary conditions on any candidate for a 
solution to the partial differential equation that together they form a 
set of sufficient conditions that enable the problem to be solved. 

Fix attention on one of these surfaces and suppose it passes through 
the point $P_0 = (x_0, y_0, z_0)$. The equation 

$$F(x_0, y_0, z_0, p, q) = 0 \tag{C.1}$$

is an equation relating p and q at $P_0$, and we can think of it as defining 
q as a function of p: $q = q(x_0, y_0, z_0, p)$. So we have a one-parameter 
family of planes through $P_0$ with equations: 

$$z - z_0 = p(x - x_0) + q(y - y_0). \tag{C.2}$$

These planes envelope a cone through $P_0$ that is tangent to the surface. 
To find this envelope, we differentiate the above equation with respect 
to p and obtain 

$$0 = (x - x_0) + (y - y_0)\frac{dq}{dp}. \tag{C.3}$$

From the first equation we obtain 

$$\frac{dF}{dp} = 0 = F_p + F_q\frac{dq}{dp}, \tag{C.4}$$

and so we may write 

$$\frac{x - x_0}{F_p} = \frac{y - y_0}{F_q}, \tag{C.5}$$

and therefore 

$$\frac{x - x_0}{F_p} = \frac{y - y_0}{F_q} = \frac{z - z_0}{pF_p + qF_q}. \tag{C.6}$$

From Eqs. (C.2) and (C.5) we deduce that these values of p and q define 
a tangent line in the cone that is tangent to the surface and is given by 
Eq. (C.6). 

The tangent line is a tangent to the characteristic curve in the 
surface through $P_0$, which therefore satisfies the equations 

$$\frac{dx}{F_p} = \frac{dy}{F_q} = \frac{dz}{pF_p + qF_q}, \tag{C.7}$$

which we can regard as 

$$\frac{dx}{dt} = F_p, \quad \frac{dy}{dt} = F_q, \quad \frac{dz}{dt} = pF_p + qF_q. \tag{C.8}$$

This is not enough information to determine the surface. But we also 
know that any characteristic curve in the surface (regarded as a curve 
parameterised by t) must satisfy the equations 

$$\frac{dp}{dt} = p_x\frac{dx}{dt} + p_y\frac{dy}{dt} = p_xF_p + p_yF_q$$

and 

$$\frac{dq}{dt} = q_x\frac{dx}{dt} + q_y\frac{dy}{dt} = q_xF_p + q_yF_q.$$

We can also differentiate the equation F(x, y, z, p, q) = 0 with respect to x 
and with respect to y and obtain 

$$F_x + F_zp + F_pp_x + F_qq_x = 0 \quad\text{and}\quad F_y + F_zq + F_pp_y + F_qq_y = 0.$$

Therefore, because $p_y = q_x$, 

$$\frac{dp}{dt} = -F_x - F_zp, \quad \frac{dq}{dt} = -F_y - F_zq.$$

Now we have five equations for the functions that define the 
characteristic curves and fill out the solution surface, 
x(t), y(t), z(t), p(t), q(t). This is too many, but the condition 
F(x, y, z, p, q) = 0 implies that $\frac{dF}{dt} = 0$, so it merely says that F is 
constant along any of these curves, which is as it should be. 
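
The five characteristic equations, $dx/dt = F_p$, $dy/dt = F_q$, $dz/dt = pF_p + qF_q$, $dp/dt = -F_x - F_zp$, $dq/dt = -F_y - F_zq$, can be integrated numerically to see that F stays constant along a characteristic. The sketch below is an illustration only: the test equation F = pq − z (with solution z = xy), the finite-difference gradient, and the names `grad`, `rhs`, and `integrate` are all assumptions made for this example.

```python
def grad(F, w, h=1e-6):
    """Central-difference partial derivatives of F at w = (x, y, z, p, q)."""
    out = []
    for i in range(5):
        up, down = list(w), list(w)
        up[i] += h
        down[i] -= h
        out.append((F(*up) - F(*down)) / (2 * h))
    return out

def rhs(F, w):
    """Right-hand sides of the five characteristic equations."""
    Fx, Fy, Fz, Fp, Fq = grad(F, w)
    p, q = w[3], w[4]
    return [Fp, Fq, p * Fp + q * Fq, -Fx - Fz * p, -Fy - Fz * q]

def integrate(F, w0, t_end, steps=1000):
    """Integrate the characteristic system from t = 0 to t_end with RK4."""
    w, h = list(w0), t_end / steps
    for _ in range(steps):
        k1 = rhs(F, w)
        k2 = rhs(F, [a + h / 2 * b for a, b in zip(w, k1)])
        k3 = rhs(F, [a + h / 2 * b for a, b in zip(w, k2)])
        k4 = rhs(F, [a + h * b for a, b in zip(w, k3)])
        w = [a + h / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(w, k1, k2, k3, k4)]
    return w

# Start on the surface z = x*y at (1, 2), where p = y = 2 and q = x = 1.
F = lambda x, y, z, p, q: p * q - z
w = integrate(F, [1.0, 2.0, 2.0, 2.0, 1.0], 0.5)
print(F(*w), w[2] - w[0] * w[1])
```

Both printed numbers should be close to zero: F remains (numerically) constant along the curve, and the curve stays on the surface z = xy.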

It remains to check that these characteristic curves define a surface 
that is a solution to the partial differential equation, and that a surface 
can be found passing through any initial curve 
$(x_0(s), y_0(s), z_0(s))$, 0 < s < 1, that is not a characteristic curve. The 
five equations for $\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}, \frac{dp}{dt}, \frac{dq}{dt}$ are all ordinary differential 
equations. We suppose that when t = 0 

$$x = x_0(s), \quad y = y_0(s), \quad z = z_0(s), \quad p = p_0(s), \quad q = q_0(s);$$

the first three of these equations place the point (x, y, z) on the initial 
curve. We appeal to the 
existence of solutions to an ordinary differential equation to deduce 
that in the neighbourhood of the initial curve there are functions such 
that 

$$x = X(s, t), \quad y = Y(s, t), \quad z = Z(s, t), \quad p = P(s, t), \quad q = Q(s, t)$$

and 

$$X(s, 0) = x_0(s), \quad Y(s, 0) = y_0(s), \quad Z(s, 0) = z_0(s), \quad P(s, 0) = p_0(s), \quad Q(s, 0) = q_0(s).$$

It remains to check that these curves define a surface z = z(x, y) and 
that this surface is a solution of the partial differential equation. To 
eliminate s and t so that we can write 

$$s = s(x, y), \quad t = t(x, y), \quad z = Z(s(x, y), t(x, y)) = z(x, y)$$

(if you forgive the ambiguous notation for a moment), requires that the 
Jacobian $\begin{pmatrix} X_s & X_t \\ Y_s & Y_t \end{pmatrix}$ be invertible when t = 0, which it is because its 
determinant $\frac{dx_0}{ds}F_q - \frac{dy_0}{ds}F_p$ is non-zero. 

Finally, to see that z = z(x, y) is a solution of the partial differential 
equation we check that 

$$F(x, y, z(x, y), z_x(x, y), z_y(x, y)) = 0,$$

which involves checking only that $p = z_x$ and $q = z_y$. 


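However, the argument to establish this conclusion is somewhat 
roundabout. One defines 

$$U = z_t - px_t - qy_t,$$

$$V = z_s - px_s - qy_s.$$

If we see these equations as equations for p and q, we can deduce that 

$$\begin{pmatrix} x_t & y_t \\ x_s & y_s \end{pmatrix}\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} z_t - U \\ z_s - V \end{pmatrix}.$$

We can solve these equations provided that $x_sy_t - x_ty_s \neq 0$, which we 
have already observed above. 

Then we treat the equations 

$$0 = z_t - z_xx_t - z_yy_t,$$

$$0 = z_s - z_xx_s - z_yy_s,$$

in the same way, and deduce that 

$$\begin{pmatrix} x_t & y_t \\ x_s & y_s \end{pmatrix}\begin{pmatrix} z_x \\ z_y \end{pmatrix} = \begin{pmatrix} z_t \\ z_s \end{pmatrix}.$$

Now we can see that our remaining problem is solved if U = 0 and 
V = 0. Happily for us, U = 0 is a consequence of the characteristic 
equations. The requirement on V is more difficult. We regard U and V 
both as functions of s and t. Then we calculate $\frac{\partial V}{\partial t} - \frac{\partial U}{\partial s}$ and use the 
fact that U = 0 to deduce that $\frac{\partial U}{\partial s} = 0$ to obtain an expression for $\frac{\partial V}{\partial t}$. 
Then, from the equation F = 0 (differentiated with respect to s) we 
deduce that 

$$\frac{\partial V}{\partial t} = -F_zV.$$

We treat this as an ordinary differential equation for V as a function of 
t for each fixed s and solve it to obtain 

$$V(s, t) = V(s, 0)\exp\left(-\int_0^t F_z\,dt\right).$$

But $V(s, 0) = \frac{dz_0}{ds} - p_0\frac{dx_0}{ds} - q_0\frac{dy_0}{ds} = 0$ and so V = 0 for all s and t. 
This concludes the proof. 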


Note also that nothing more has been required of these functions 
than that they are at least two times continuously differentiable with 
respect to the obvious variables. 


Appendix D 


Green’s Theorem and Heat Conduction 


Green’s theorem proved endlessly useful in the study of the partial 
differential equations of mathematical physics, as the following 
examples illustrate. 


D.1 Explicit Representations 


To proceed further, we come down to two dimensions. The same 
process applies as it did in three dimensions, except that v is to be 
infinite like $\frac{1}{2\pi}\ln r$ at a point of D. If the Dirichlet problem is posed for a 
two-dimensional region D with boundary C then the relation between 
the Dirichlet problem and Green’s problem is given by 

$$u(P) = \int_C f\,(n.\nabla v)$$

for a suitable function v. 

The Dirichlet problem for the unit disc asks for a harmonic function 
u on the disc that agrees with a given function f on the unit circle. If we 
take the approach of finding a Green’s function then we have to find a 
function v(x, y). 

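It helps to be clear about domains and coordinates. We regard v as a 
function of two points of D, with coordinates (x, y) and (ξ, η), and write 
v = v(x, y; ξ, η). We then define 

$$v(x, y; \xi, \eta) = \frac{1}{2\pi}\ln\left(\left((x - \xi)^2 + (y - \eta)^2\right)^{1/2}\right) + h(x, y; \xi, \eta),$$

for some suitable function h that we have to find, where we 
require that 

$$\nabla^2 h = 0 \quad\text{on } D,$$

and, for v to vanish when (ξ, η) ∈ C, that 

$$h(x, y; \xi, \eta) = -\frac{1}{2\pi}\ln\left(\left((x - \xi)^2 + (y - \eta)^2\right)^{1/2}\right), \quad (\xi, \eta) \in C.$$

So we have set $v(x, y) = \frac{1}{2\pi}\ln r + h(x, y)$ where h is to be harmonic 
everywhere in the disc and has the prescribed behaviour on the 
boundary that $h(x, y, \xi, \eta) = -\frac{1}{2\pi}\ln r$, where r is the distance from (x, y) 
to (ξ, η), when (ξ, η) ∈ C. 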


If we let D be the upper half-plane, so the boundary C is the real 
axis, then we define h by looking at the mirror image of the 
point (x, y) in the real axis, which is the point (x, −y). We now define 

$$h(x, y, \xi, \eta) = -\frac{1}{2\pi}\ln r',$$

where r′ is the distance from (x, −y) to (ξ, η). This function fails to be 
harmonic only at (x, −y), which is not in D, and it is equal to $-\frac{1}{2\pi}\ln r$ on 
the boundary, as required. 
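Now let D be the unit disc and take $v = \frac{1}{2\pi}\ln r$, where r is the 
distance from the origin O. The unit normal vector 
at a point (x, y) on the unit circle is (x, y). The above argument says we 
have 

$$u(O) = \int_{|z|=1} f\,n.\nabla v.$$

Here 

$$\nabla v = \frac{1}{2\pi}(\partial_x \log r, \partial_y \log r).$$

We find 

$$r = (x^2 + y^2)^{1/2}, \quad \partial_x r = \frac{x}{r},$$

so 

$$\partial_x \log r = \frac{x}{r^2},$$

and similarly $\partial_y \log r = \frac{y}{r^2}$. So on the unit circle 

$$n.\nabla v = \frac{1}{2\pi}(x, y).\left(\frac{x}{r^2}, \frac{y}{r^2}\right) = \frac{1}{2\pi}\frac{x^2 + y^2}{r^2} = \frac{1}{2\pi},$$

and therefore 

$$u(O) = \frac{1}{2\pi}\int_{|z|=1} f\,ds.$$

This expresses the averaging property of a harmonic function on a disc: 
its value at the centre is the average of its values on the boundary circle. 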
What happens if we move the point where Green’s function 
becomes infinite to somewhere off-centre but still in the disc, say z = α, 
|α| < 1? There is a Möbius transformation that maps the unit circle to 
itself and the point α to the origin, 

$$\gamma(z) = \frac{az - b}{-\bar{b}z + \bar{a}},$$

where α = b/a. The corresponding Green’s function is 

$$\frac{1}{2\pi}\ln\left|\frac{az - b}{-\bar{b}z + \bar{a}}\right|.$$

There is a classical argument called the method of images that finds the 
harmonic function explicitly from its boundary values when D is the 
unit disc. We move the infinite point to P = (x, y) or (ρ, θ) in polar 
coordinates, and define P′ = (ρ′, θ) where ρ′ = 1/ρ; the point P′ is called 
the image of P because it is obtained by inverting P in the unit circle 
(Fig. D.1). 

Fig. D.1 Finding Green’s function at an off-centre point 

Now we look at the point Q with (Cartesian) coordinates (ξ, η) and 
polar coordinates $(\rho_0, \varphi)$. We let O denote the origin, and let r denote the 
distance PQ and r′ denote the distance P′Q. Then by applying the 
cosine rule first to triangle OPQ and then to triangle OP′Q we find that 

$$r^2 = \rho^2 + \rho_0^2 - 2\rho\rho_0\cos(\theta - \varphi),$$

$$r'^2 = \rho'^2 + \rho_0^2 - 2\rho'\rho_0\cos(\theta - \varphi).$$

When Q is on the boundary of the circle, so $|OQ| = \rho_0 = 1$, we find that 

$$\frac{r^2}{r'^2} = \frac{1 + \rho^2 - 2\rho\cos(\theta - \varphi)}{1 + 1/\rho^2 - (2/\rho)\cos(\theta - \varphi)} = \rho^2.$$

So the Green’s function for the Laplacian is now 

$$v(x, y) = \frac{1}{2\pi}\ln\frac{r}{r'\rho} = \frac{1}{4\pi}\ln\frac{\rho^2 + \rho_0^2 - 2\rho\rho_0\cos(\theta - \varphi)}{\rho^2\rho_0^2 + 1 - 2\rho\rho_0\cos(\theta - \varphi)}.$$

In this case, n.∇v is $\partial v/\partial\rho_0$ evaluated at $\rho_0 = 1$, which is 

$$\frac{1}{2\pi}\frac{1 - \rho^2}{1 + \rho^2 - 2\rho\cos(\theta - \varphi)}.$$

This gives the Poisson integral formula for the disc: 

$$u(\rho, \theta) = \frac{1}{2\pi}\int_0^{2\pi}\frac{1 - \rho^2}{1 + \rho^2 - 2\rho\cos(\theta - \varphi)}f(\varphi)\,d\varphi.$$

Note that if ρ = 0 this collapses to the averaging result that we had 
before, as it should. 
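
The Poisson integral can be tried out numerically. The sketch below is an illustration under stated assumptions: it approximates the integral by an equally spaced sum over the boundary angle, the function name `poisson` and the number of nodes are hypothetical choices, and the boundary data f(φ) = cos φ is used because its harmonic extension, ρ cos θ, is known.

```python
import math

def poisson(f, rho, theta, n=2000):
    """Approximate the Poisson integral for the unit disc with n equally spaced nodes."""
    total = 0.0
    for k in range(n):
        phi = 2 * math.pi * k / n
        kernel = (1 - rho ** 2) / (1 + rho ** 2 - 2 * rho * math.cos(theta - phi))
        total += kernel * f(phi)
    # (1/2pi) times the node spacing 2pi/n leaves a plain average of the terms.
    return total / n

# Boundary values cos(phi) extend to the harmonic function rho*cos(theta).
print(poisson(math.cos, 0.5, 1.0), 0.5 * math.cos(1.0))
# At the centre the formula reduces to the average of the boundary values.
print(poisson(lambda phi: 3.0 + math.sin(phi), 0.0, 0.0))
```

The second call shows the averaging property: the value at the centre is the mean of the boundary data.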


D.1.1 Adjoint Equations 


A later generalisation of Green’s argument became known as the 
method of adjoint equations. The idea of introducing the adjoint 
equation of a given ordinary or partial differential equation is to make 
the original equation easier to solve, and as was briefly mentioned in 
Sect. 1.5 it was introduced by Lagrange in the context of ordinary 
differential equations. 

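We write, following Sommerfeld ([248], Sect. 10) 

$$L(u) = A\frac{\partial^2 u}{\partial x^2} + 2B\frac{\partial^2 u}{\partial x\partial y} + C\frac{\partial^2 u}{\partial y^2} + D\frac{\partial u}{\partial x} + E\frac{\partial u}{\partial y} + Fu,$$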

where A, B, C, D, E, F are sufficiently differentiable functions of x and y. 
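The adjoint equation to L(u) = 0 will be another second-order 
partial differential equation, M(v) = 0, such that 

$$vL(u) - uM(v) = \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y}$$

for two functions X(x, y) and Y(x, y) that have also to be found. 

It turns out that 

$$M(v) = \frac{\partial^2 (Av)}{\partial x^2} + 2\frac{\partial^2 (Bv)}{\partial x\partial y} + \frac{\partial^2 (Cv)}{\partial y^2} - \frac{\partial (Dv)}{\partial x} - \frac{\partial (Ev)}{\partial y} + Fv$$

and 

$$X = A\left(v\frac{\partial u}{\partial x} - u\frac{\partial v}{\partial x}\right) + B\left(v\frac{\partial u}{\partial y} - u\frac{\partial v}{\partial y}\right) + \left(D - \frac{\partial A}{\partial x} - \frac{\partial B}{\partial y}\right)uv,$$

$$Y = B\left(v\frac{\partial u}{\partial x} - u\frac{\partial v}{\partial x}\right) + C\left(v\frac{\partial u}{\partial y} - u\frac{\partial v}{\partial y}\right) + \left(E - \frac{\partial B}{\partial x} - \frac{\partial C}{\partial y}\right)uv.$$

The expression for Y is obtained from that for X by interchanging x 
with y, A with C, and D with E. 
You can check that L(v) is the adjoint of M(u), and that L(u) = M(u) 
(when the equation L(u) = 0 is said to be self-adjoint) if and only if 

$$\frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} = D, \quad \frac{\partial B}{\partial x} + \frac{\partial C}{\partial y} = E.$$

In particular, if the coefficients are constant and D = E = 0 the 
equation is self-adjoint. 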

It is true, but will not be proved here, that if a second-order linear 
partial differential equation arises as an Euler-Lagrange equation then 
it is self-adjoint. It is also true that a second-order linear partial 
differential equation with constant coefficients can be made 
self-adjoint by multiplying it by a factor of the form exp(λx + μy) unless 
it is the heat equation. 
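Let us now integrate $\frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y}$ over a region S with area element dσ 
bounded by a simple closed curve C with line element ds. The theorem 
of Gauss and Green says that 

$$\int_S (vL(u) - uM(v))\,d\sigma = \int_S\left(\frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y}\right)d\sigma = \int_C (X, Y).n\,ds.$$

In the elliptic case in normal form, where A = C = 1, B = 0, the RHS 
becomes 

$$\int_C\left(v\frac{\partial u}{\partial n} - u\frac{\partial v}{\partial n}\right)ds + \int_C uv\,(D, E).n\,ds.$$

This generalises the usual Green’s theorem of potential theory (the 
case where also D = E = 0), which says that 

$$\int_S (v\Delta(u) - u\Delta(v))\,d\sigma = \int_S\left(\frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y}\right)d\sigma,$$

where 

$$X = v\frac{\partial u}{\partial x} - u\frac{\partial v}{\partial x}, \quad Y = v\frac{\partial u}{\partial y} - u\frac{\partial v}{\partial y}.$$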

The importance of the adjoint equation arises from the fact that if u and 
v are such that 

$$L(u) = 0, \quad M(v) = 0$$

then the LHS’s above vanish, and the corresponding equations become 

$$\int_C (X, Y).n\,ds = 0$$

and 

$$\int_S\left(\frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y}\right)d\sigma = \int_C (X, Y).n\,ds = 0.$$

This holds provided that u, v, and their derivatives are continuous 
throughout the region S. If v, say, is discontinuous at a point Q = (ξ, η) 
in S then it is excluded by drawing an arbitrarily small contour K 
around it, and taking the integral over both C in the positive direction 
and K in the negative direction. 


The most important single case is where the discontinuity of v at Q 
represents a point source of unit strength, one for which the yield q is 
the gradient of v. This means that at a radial distance ρ from Q 

$$q = \oint\frac{\partial v}{\partial\rho}\,ds.$$

If, very close to Q, v depends only on ρ then 

$$q = \int_0^{2\pi}\frac{\partial v}{\partial\rho}\rho\,d\varphi = 2\pi\rho\frac{\partial v}{\partial\rho},$$

so, for a source of unit strength (q = 1), 

$$\frac{\partial v}{\partial\rho} = \frac{1}{2\pi\rho}, \quad v = \frac{1}{2\pi}\log\rho + \text{const.}$$

as ρ → 0. 
So we have 

$$v = U\log\rho + V, \quad \rho = \sqrt{(x - \xi)^2 + (y - \eta)^2},$$

where U and V are analytic functions of (x, y) and (ξ, η) such that 
$U \to \frac{1}{2\pi}$ as (x, y) → (ξ, η). 

This function v is called a Green’s function or a principal solution of 
the differential equation M(v) = 0. Similarly, the function u is called a 
Green’s function or a principal solution of the differential equation 
L(u) = 0. The functions U and V will be analytic if D, E, and F are 
analytic. 

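Next, an account of how this comes about in the context of heat 
conduction, following (Sommerfeld [248], Sect. 12). 

The partial differential equation for heat conduction is (with y = kt) 

$$L(u) = \frac{\partial^2 u}{\partial x^2} - \frac{\partial u}{\partial y} = 0.$$

It is not self-adjoint; the adjoint equation is 

$$M(v) = \frac{\partial^2 v}{\partial x^2} + \frac{\partial v}{\partial y} = 0.$$

Furthermore, 

$$X = v\frac{\partial u}{\partial x} - u\frac{\partial v}{\partial x}, \quad\text{and}\quad Y = -uv.$$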


Because x is a space variable but y is a time variable, we choose to 
consider this only for simple regions bounded by sides parallel to the x 
or y axes. For such regions, along a side AB that is parallel to the x-axis 

$$ds = dx \text{ and } dn = -dy; \quad \cos(n, x) = 0, \quad \cos(n, y) = -1,$$

and so 

$$\int_A^B (X\cos(n, x) + Y\cos(n, y))\,ds = -\int_A^B Y\,dx.$$

Similarly, along a side CD parallel to the y-axis we obtain 

$$\int_C^D (X\cos(n, x) + Y\cos(n, y))\,ds = \int_C^D X\,dy.$$

The general form of Green’s theorem then says 

$$\iint (vL(u) - uM(v))\,dx\,dy = \int uv\,dx + \int\left(v\frac{\partial u}{\partial x} - u\frac{\partial v}{\partial x}\right)dy.$$


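We apply this to a heat conductor infinite in both directions and for 
which the temperature at time t = 0 is given as u = f(x). The Fourier 
integral representation of the function f(x) is 

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty} f(\xi)e^{i\omega(x - \xi)}\,d\xi\right)d\omega.$$

To obtain a solution to the heat equation, we multiply the exponential 
term by a function φ(y) and plug this expression into the equation. We 
find that we require 

$$-\omega^2\varphi(y) = \frac{d\varphi}{dy},$$

so 

$$\varphi(y) = Ce^{-\omega^2 y}.$$

As we require φ(0) = 1, we obtain the solution to the heat equation in 
the form 

$$u = \frac{1}{2\pi}\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty} f(\xi)e^{i\omega(x - \xi) - \omega^2 y}\,d\xi\right)d\omega.$$

Curiously, the resulting integral converges for a wider class of functions 
than does the original representation of the function f, and moreover 
the order of integration is now reversible. 

We now write y = kt and obtain the solution in the form 

$$u(x, t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty} f(\xi)e^{i\omega(x - \xi) - \omega^2 kt}\,d\xi\right)d\omega.$$

The exponent is of the form $-\alpha\omega^2 + \beta\omega$, so we complete the square: 

$$-\alpha\omega^2 + \beta\omega = -\alpha\left(\omega - \frac{\beta}{2\alpha}\right)^2 + \frac{\beta^2}{4\alpha}.$$

Set p = ω − β/2α, and then 

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega(x - \xi) - \omega^2 kt}\,d\omega = \frac{1}{2\pi}e^{\beta^2/4\alpha}\int_{-\infty}^{\infty} e^{-\alpha p^2}\,dp.$$

Write the RHS as U for the moment. 
There is an integral due to Laplace that was well known in the nineteenth 
century (and doubtless still is) that says 

$$\int_{-\infty}^{\infty} e^{-p^2}\,dp = \sqrt{\pi},$$

so 

$$\int_{-\infty}^{\infty} e^{-\alpha p^2}\,dp = \sqrt{\frac{\pi}{\alpha}}$$

and, with α = kt and β = i(x − ξ), 

$$U = \frac{1}{\sqrt{4\pi kt}}\exp\left(-\frac{(x - \xi)^2}{4kt}\right).$$

As t → 0 we have u(x, t) → f(x), so 

$$f(x) = \lim_{t \to 0}\int_{-\infty}^{\infty} f(\xi)U\,d\xi$$

and 

$$\lim_{t \to 0}\int_{x - \varepsilon}^{x + \varepsilon} U\,d\xi = 1.$$

For a source of heat concentrated at the origin we have ξ = 0 and 

$$U = \frac{1}{\sqrt{4\pi kt}}\exp\left(-\frac{x^2}{4kt}\right).$$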

D.1.2 Boundary Value Problems 


We shall suppose that the heat conductor is infinite in both directions 
and can be represented by the real line.!° We write 

$$u(x, t) = \int_{-\infty}^{\infty} f(\xi)U\,d\xi, \quad\text{where}\quad U = \frac{1}{\sqrt{4\pi kt}}\exp\left(-\frac{(x - \xi)^2}{4kt}\right).$$

We shall consider two separate boundary conditions. The isothermal 
one for a temperature distribution u(0, t), where we impose u = 0; and 
the adiabatic one for a given heat flow G(0, t), where we impose 
∂u/∂x = 0. 

Both these conditions are satisfied if the function f, which is given 
only for x > 0, is extended to the negative real axis either as 
an odd or an even function, and so represented by a pure sine or a 
pure cosine integral. This yields 

$$u(x, t) = \int_0^{\infty} f(\xi)U(\xi)\,d\xi \mp \int_0^{\infty} f(\xi)U(-\xi)\,d\xi,$$

with the upper sign for the isothermal condition and the lower sign for 
the adiabatic one. The principal solution U(−ξ) becomes 

$$U(-\xi) = \frac{1}{\sqrt{4\pi kt}}\exp\left(-\frac{(x + \xi)^2}{4kt}\right)$$

and describes a point source of heat at x = −ξ. 
So we have 

$$u(x, t) = \int_0^{\infty} f(\xi)G(\xi)\,d\xi, \quad\text{where}\quad G(\xi) = U(\xi) \mp U(-\xi).$$

The function G is called a Green’s function. It has only one pole (or heat 
source) in the interval x > 0, and it satisfies the adjoint equation 
as a function of ξ and τ because it is independent of τ and so the 
change of sign with respect to the τ variable is irrelevant. 
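
A short numerical check of the two boundary conditions: combining the source at ξ with an image source at −ξ gives a function that either vanishes at x = 0 (isothermal) or has zero x-derivative there (adiabatic). This is a sketch under stated assumptions: the `sign` argument, the value k = 1, and the sample values are choices made for this illustration.

```python
import math

def U(x, xi, t, k=1.0):
    """Principal solution of the heat equation for a unit source at xi."""
    return math.exp(-(x - xi) ** 2 / (4 * k * t)) / math.sqrt(4 * math.pi * k * t)

def G(x, xi, t, sign, k=1.0):
    """Source at xi plus an image at -xi.

    sign = -1 gives the isothermal case (G = 0 at x = 0);
    sign = +1 gives the adiabatic case (dG/dx = 0 at x = 0).
    """
    return U(x, xi, t, k) + sign * U(x, -xi, t, k)

def dG_dx_at_zero(xi, t, sign, h=1e-4):
    """Central-difference estimate of dG/dx at x = 0."""
    return (G(h, xi, t, sign) - G(-h, xi, t, sign)) / (2 * h)

print(G(0.0, 0.7, 0.3, -1))
print(dG_dx_at_zero(0.7, 0.3, +1))
```

Both printed values should be zero up to rounding, reflecting the odd and even symmetry of the two combinations about x = 0.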


The initial conditions are now to be represented by not a single 
point source but a continuum of them—first sum over a finite number 


of point sources of heat and then pass to an integral. They are placed at 
every a+ da. 


The corresponding function G is given by 
= 
G = UE) + AU(-é) + { a(n)U(n)dn, 


—CoO 


where the point source of heat contributes an amount A and the 
continuous source is represented by a(η). 

The boundary conditions can then be used to determine A and a(η). 


Appendix E 


Complex Analysis 


It is impossible to explain Riemann’s contribution to the study of 
differential equations without describing his approach to the theory of 
complex analysis, of which he is one of the three creators, alongside 
Cauchy and Weierstrass. But obviously there is no room here for a 
proper historical account of the emergence of key ideas on complex 
function theory. This chapter is therefore a series of glimpses into some 
of these ideas. 


E.1 Harmonic Functions 


We have seen that it is an elementary consequence of the 
Cauchy–Riemann equations that the real and imaginary parts of a complex 
analytic function are harmonic functions. We shall now see that given 
the real part (say) of a complex function on a simply connected domain 
the imaginary part is determined up to a constant (we assume the 
necessary partial derivatives exist and are continuous). Suppose that 
we are given u(x, y) and required to find v(x, y) such that 
u(x, y) + iv(x, y) is a complex analytic function. Then we may solve the 
Cauchy–Riemann equations 

$$u_x = v_y \quad\text{and}\quad u_y = -v_x.$$

For these equations to have a solution v it is necessary that $u_{xx} = -u_{yy}$, 
because the equations we are trying to solve say that these expressions 
are each equal to $v_{xy}$, but this condition is met when the given function 
u is harmonic. 

We now solve the equation $v_x = -u_y$ as follows. We can integrate 
both sides with respect to x to obtain 

$$v = -\int u_y\,dx,$$

which is single-valued because u is defined on a simply connected 
domain, so this determines v up to a constant. Then we check that 
$v_y = u_x$ by differentiating under the integral sign: 

$$v_y = -\int u_{yy}\,dx = \int u_{xx}\,dx = u_x,$$

as required. The function v is said to be the “harmonic conjugate” of the 
given function u. 
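
The construction of a harmonic conjugate can be imitated numerically. The sketch below is an illustration under stated assumptions, not the text's own procedure: it fixes the constant by v(0, 0) = 0, integrates $v_y = u_x$ up the y-axis and then $v_x = -u_y$ along a horizontal segment, and uses finite differences and Simpson's rule; the function names and step sizes are hypothetical. The test function $u = e^x\cos y$ has the known conjugate $e^x\sin y$.

```python
import math

def d_dx(u, x, y, h=1e-5):
    """Central-difference estimate of u_x."""
    return (u(x + h, y) - u(x - h, y)) / (2 * h)

def d_dy(u, x, y, h=1e-5):
    """Central-difference estimate of u_y."""
    return (u(x, y + h) - u(x, y - h)) / (2 * h)

def simpson(g, a, b, n=200):
    """Composite Simpson rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

def conjugate(u, x, y):
    """Harmonic conjugate v of u with v(0, 0) = 0, via the Cauchy-Riemann equations."""
    up = simpson(lambda t: d_dx(u, 0.0, t), 0.0, y)      # v_y = u_x along x = 0
    across = simpson(lambda s: -d_dy(u, s, y), 0.0, x)   # v_x = -u_y along the horizontal
    return up + across

u = lambda x, y: math.exp(x) * math.cos(y)
print(conjugate(u, 0.8, 0.6), math.exp(0.8) * math.sin(0.6))
```

A different base point would change the result only by an additive constant, in line with the statement that v is determined up to a constant.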

By the time Riemann was writing his Ph.D. thesis in 1850, the basic 
properties of harmonic functions were well known. For example, the 
value of a harmonic function h(x, y) at a point $(x_0, y_0)$ in its domain is 
the average of its values on any circle centred on that point that lies, 
with its interior, in the domain (this is little 
more than a statement of the Cauchy integral formula, but it was 
known independently as a theorem in harmonic functions due to Gauss). 
Because an average value can never also be a largest (or a smallest) 
value unless all the values are the same, this means that a harmonic 
function takes its maximal value(s) on the boundary of its domain. This 
means that if a non-constant harmonic function is defined everywhere 
in the plane it cannot be bounded. In the context of 
complex analytic functions, this is Liouville’s theorem. Moreover, if two 
harmonic functions defined on the same domain have the same values 
on the boundary of the domain, they are equal. For, consider their 
difference. It is a harmonic function whose boundary values are zero, 
and by the earlier remark, this means that the difference of the two 
functions is zero everywhere in the domain, and so the two functions 
are equal. 


E.2 Branch Points and Many-Valued “Functions” 


Throughout the nineteenth century many-valued “functions” such as 
the nth root function, the infinitely many-valued log function, the 
arcsine function, and others were considered as legitimate functions. 
Let us consider the square root “function”, which assigns to a non-zero
complex number $z = re^{i\theta}$ its two square roots

\[ r^{1/2}e^{i\theta/2} \quad\text{and}\quad r^{1/2}e^{i(\theta/2+\pi)} = -r^{1/2}e^{i\theta/2}, \]

and investigate what happens as the domain variable is moved along a
circle around the origin.

For a fixed value of r, as $\theta$ goes from 0 to $2\pi$ and z goes once round the
origin on a circle of radius r, the square root that was initially positive
goes on a semicircle from $r^{1/2}$ to $-r^{1/2}$, the negative square root, and
the root that was initially negative goes on a semicircle from $-r^{1/2}$ to
$r^{1/2}$, the positive square root. The two roots are interchanged in this
process.
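
The interchange of the two roots can be checked numerically. The following is only a sketch: the function name `continue_sqrt`, the radius 4 and the number of steps are my own illustrative choices, and "continuity" is imitated by always keeping whichever of the two roots lies nearer to the previous value.

```python
import cmath
import math

def continue_sqrt(radius, start_value, steps=1000):
    """Follow one square root of z continuously as z = radius*e^{i*theta}
    travels once round the origin (theta from 0 to 2*pi)."""
    w = start_value
    for k in range(1, steps + 1):
        theta = 2 * math.pi * k / steps
        z = radius * cmath.exp(1j * theta)
        root = cmath.sqrt(z)
        # The two square roots of z are +root and -root; keep the one
        # nearer to the value chosen at the previous step.
        w = root if abs(root - w) <= abs(-root - w) else -root
    return w

end_from_positive = continue_sqrt(4.0, 2.0)    # start at +2, the positive root of 4
end_from_negative = continue_sqrt(4.0, -2.0)   # start at -2, the negative root of 4
print(end_from_positive, end_from_negative)
```

Starting from the positive root the process should end at the negative one, and conversely, which is the sense in which the roots are interchanged.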

Mathematicians of the nineteenth century would say that the square
root function is two-valued, and that it is branched at the point $z = 0$.


Each single-valued determination of the function, either of the square 
roots, is called a branch of the function, and it is necessary to define the 
domain of these functions carefully, as we discuss below. 

The behaviour of the many-valued function $z \mapsto z^{\alpha}$ is not very
different from that of $z \mapsto z^{1/2}$. We set $z = re^{i\theta}$ and observe that

\[ re^{i\theta} = re^{i(\theta + 2k\pi)}, \]

so the values of $z^{\alpha}$ are all of the form

\[ z^{\alpha} = r^{\alpha}e^{i(\theta + 2k\pi)\alpha} = r^{\alpha}e^{i\theta\alpha}e^{2k\pi i\alpha}, \]

and are obtained from each other by multiplying by a suitable power of $e^{2\pi i\alpha}$.
If $\alpha = p/q$ is rational and in lowest terms then the function takes q distinct values, and if $\alpha$ is
irrational then the function is infinitely many-valued.
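
As an illustration, one can count the distinct values produced by the arguments $\theta + 2k\pi$ for a range of integers k. This is a sketch under assumptions of my own: the helper `power_values`, the sample point, the range of k, and the tolerance used to decide when two floating-point values coincide.

```python
import cmath
import math
from fractions import Fraction

def power_values(z, alpha, ks=range(-20, 21), tol=1e-9):
    """Distinct values of z**alpha arising from the arguments theta + 2*k*pi."""
    r, theta = cmath.polar(z)
    a = float(alpha)
    found = []
    for k in ks:
        w = r ** a * cmath.exp(1j * (theta + 2 * math.pi * k) * a)
        # Record w only if it differs from every value already found.
        if all(abs(w - u) > tol for u in found):
            found.append(w)
    return found

z = 2 * cmath.exp(0.3j)
rational_case = power_values(z, Fraction(3, 4))     # alpha = p/q with q = 4
irrational_case = power_values(z, math.sqrt(2))     # alpha irrational
print(len(rational_case), len(irrational_case))
```

For $\alpha = 3/4$ only four values should appear however many k are tried, whereas for $\alpha = \sqrt{2}$ every one of the 41 values of k tried gives a new value.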
The same account holds for the function

\[ z \mapsto (z - a)^{\alpha}, \]

which is branched at the point $z = a$.


Now let h(z) be an analytic function, and suppose that its power
series expansion in a neighbourhood of the origin is

\[ h(z) = h_0 + h_1 z + h_2 z^2 + \cdots. \]

Because each power of z in this expansion is integral we deduce that
the function h(z) is single-valued in a neighbourhood of the origin. It
follows that the function

\[ z \mapsto z^{\alpha}h(z) \]

takes as many values as does the function $z \mapsto z^{\alpha}$. The function
$z^{\alpha}h(z)$ is said to be branched at the point $z = 0$; Riemann said
more informally that it behaves like $z \mapsto z^{\alpha}$ near the origin.

Conversely, if the function F(z) is not single-valued near the
origin one can look for a value of $\alpha$ such that the function

\[ z \mapsto z^{\alpha}F(z) \]

is single-valued in a neighbourhood of the origin.

For future reference, we note that if we differentiate the function
$(z - a)^{\alpha}h(z)$ we find

\[ \frac{d}{dz}\bigl((z - a)^{\alpha}h(z)\bigr) = \alpha(z - a)^{\alpha - 1}h(z) + (z - a)^{\alpha}h'(z) = (z - a)^{\alpha - 1}\bigl(\alpha h(z) + (z - a)h'(z)\bigr), \]

so the derived function is branched like $(z - a)^{\alpha - 1}$ at $z = a$.


As we shall see, we are free to consider a complex many-valued
function that is branched at several points.


On the other hand, if we stay away from the branch points, and let 
the domain variable vary only along curves that do not go round a 
branch point, we can recover a single-valued function, albeit on a 
restricted domain. This function will be the restriction of a branch of 
the many-valued function to the restricted domain. In the main, this is 
what Cauchy did in developing his theory of complex functions. 

For example, suppose we pick a value of the square root function at the
point $z = 1$, say the value +1. Then near to $z = 1$ the value of the
square root function that we shall choose is, of course, the value near to
+1. We can proceed in this way by varying z and assigning the unique
value to the square root function that makes our new function
continuous, as long as we do not take z on a loop around the origin. One
way to do this is to decide not to consider values of the square root
function on the negative real axis, the points $z = re^{i\theta}$ for which $\theta = \pi$. The
domain of the square root function has been restricted to the plane of
complex numbers with the negative real axis removed (this is
commonly called a cut), and on that domain it is possible to choose a
single-valued branch of the function. More complicated many-valued
functions could similarly be restricted to other simply connected
subsets of the plane of complex numbers and in this way studied, but at
the price of introducing an element of arbitrariness into the theory.
More precisely, on the domain $\{re^{i\theta} : -\pi < \theta < \pi\}$ we define the
function $re^{i\theta} \mapsto r^{1/2}e^{i\theta/2}$. The image of this function is the right-hand
half-plane $\{\rho e^{i\psi} : -\pi/2 < \psi < \pi/2\}$.


If, however, we cross the negative real axis, the value of the square
root function would be multiplied by $-1$. More precisely, on the domain
$\{re^{i\theta} : -\pi < \theta < \pi\}$ we also define the function $re^{i\theta} \mapsto -r^{1/2}e^{i\theta/2}$. The
image of this function is the left-hand half-plane
$\{\rho e^{i\psi} : \pi/2 < \psi < 3\pi/2\}$.

We are free to choose either branch in order to assign a value to the
square root function on the cut, and it is natural to assign points $re^{i\pi}$
the value $r^{1/2}e^{i\pi/2}$.


We shall refer to this later, so let us say in general that a many-valued
function f is locally single valued if it is possible to choose a
domain D, such as a disc, that contains no branch points of f, a value of f
at a point $z_0$ in the domain D, and a single-valued function F on the
domain D that agrees with the branch of f on D that takes the specified
value at the point $z_0$.


E.3 Analytic Continuation 


We have seen that Gauss was clear about the distinction between a 
complex-valued function of a complex variable and a power series 
representation of that function. The rule of thumb for a power series 
representation is that it is valid on an open disc centred at a point P and 
of radius $R$ that is determined by the fact that there is a point on
the boundary of the disc where the function becomes infinite (more
precisely, ceases to be defined and analytic).

Thus the power series representation of the function $(1 - z)^{-1}$,
which is

\[ 1 + z + z^2 + \cdots + z^n + \cdots, \]

converges on the open disc with centre the origin and radius 1, because the
function is not defined at the point $z = 1$. The same is true of the
function $(1 + z^2)^{-1}$, because it is not defined at the points $z = \pm i$ that
lie on the boundary of the unit disc.


Very often one obtains a function by first finding a power series
representation of it.¹⁶ If, for example, one has the series

\[ 1 + z + z^2 + \cdots + z^n + \cdots \]

then one has a function defined on the open unit disc. But it is possible
to obtain a power series representation for this function on a disc
centred at the point $z = -\tfrac{1}{2}$ by introducing the new variable $z_1 = z + \tfrac{1}{2}$,
and the radius of convergence of this power series is 3/2, the distance
from $z = -\tfrac{1}{2}$ to $z = 1$. The disc on which the new power series
converges partly lies outside the unit disc, so the function defined by
the power series can be extended to a larger domain. In this way, the
function can be extended to its maximal domain, which will be the
plane of complex numbers with the point $z = 1$ removed, because, of
course, the power series has been obtained from the function
$(1 - z)^{-1}$. This process of extending the domain of definition of a
function is called analytic continuation. It can be applied to any function
defined initially on some open disc by a convergent power series, and in
the present context the examples we shall consider will produce many-valued
functions defined everywhere except at a finite set of points.¹⁷
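
The re-centring of the series can be made concrete. The sketch below assumes (what the text has just said) that the series comes from $(1 - z)^{-1}$, whose Taylor coefficients about a centre $c$ are $1/(1 - c)^{n+1}$; the function names and the test point $z = -1.5$ are my own choices. The test point lies outside the unit disc but inside the disc of radius 3/2 about $-\tfrac{1}{2}$.

```python
from fractions import Fraction

def recentred_coefficients(centre, n_terms):
    """Taylor coefficients of 1/(1 - z) about `centre`, namely 1/(1 - centre)**(n + 1)."""
    return [Fraction(1) / (1 - centre) ** (n + 1) for n in range(n_terms)]

def evaluate(coeffs, centre, z):
    """Partial sum of the power series in the new variable z - centre."""
    return sum(float(c) * (z - float(centre)) ** n for n, c in enumerate(coeffs))

centre = Fraction(-1, 2)
coeffs = recentred_coefficients(centre, 200)

# Ratio of successive coefficients gives the radius of convergence, 3/2.
radius = coeffs[0] / coeffs[1]

# z = -1.5 is outside the unit disc, where the original series diverges,
# yet the re-centred series still converges to 1/(1 - z) there.
value = evaluate(coeffs, centre, -1.5)
print(radius, value, 1 / (1 - (-1.5)))
```

The new series therefore assigns a value to the function at points where the original series could not.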

The basic facts about analytic continuation concern what happens if 
a function is continued in the way just described along a chain of discs, 
each one overlapping with the one before. For as long as this can be 
done, one says that the function is obtained by analytic continuation 
from the original disc. If you wish, you may suppose that the domain 
variable moves along a closed curve and at each point is the centre of a 
disc of possibly varying radius which is the disc of convergence of a 
power series representation of that function. But a question arises 
when a disc overlaps one much earlier in the chain of discs. In this 
situation, it can happen that the values of the function on the first and 
last discs are the same, or that they differ. They differ only if the chain 
goes round a branch point (which must not lie in any of the discs). The 
square root function is a case where something can go wrong. 


If, on the other hand, a function is extended analytically from the
same initial disc along one chain to one disc, $D_0$, and along another
chain to another disc $D_0'$, and the extended functions agree in the
intersection of $D_0$ and $D_0'$, then nothing can be said. It might be, for
example, that the function is the square root function but one of the
chains goes twice round the branch point at the origin. In this case both
chains extend the function to the same value.

However, if we add the condition that the first path can be deformed 
until it agrees with the second path and at no stage does the path as it is 
deformed pass over a branch point, then the result of analytic 
continuation along the chains will necessarily be the same. For the 
purposes of this chapter, let us call this result the deformation 
principle. 

The basic principles of analytic continuation were certainly known 
to Riemann, but it was Weierstrass who preferred to build a theory of 
analytic functions this way. 


E.4 Liouville’s Theorem 


There are many ways of distinguishing between a complex analytic
function and a map from $\mathbb{R}^2$ to $\mathbb{R}^2$. We have already remarked that a
complex analytic map is infinitely complex differentiable, and (as a
result of theorems due to Cauchy) it is expressible locally as a
convergent power series. As noted above, Riemann was the first to
draw attention to a point known earlier to Gauss: at every point where
a complex analytic function has a non-zero derivative it is conformal or
angle-preserving. But one of the most surprising properties of a
complex analytic function was discovered by Joseph Liouville; it was
also known to Riemann, but it is not clear on what grounds he believed
it.

The theorem says that a complex analytic function that is bounded 
everywhere, even at infinity, is a constant. One proof uses the Cauchy 
integral theorem to estimate the coefficients in a power series 
expansion of the function and to show that all terms but the constant 
term are arbitrarily small and must therefore be zero. Another proof 


uses the fact, also proved by Riemann, that the real and imaginary 
parts of a complex function are harmonic, together with the fact that at 
any point the value of a harmonic function is the average of the values 
the function takes on a neighbourhood of the point. This means that a 
harmonic function can only take a maximum or minimum value on the 
boundary of its domain, so if that domain is the entire plane including 
the point at infinity a bounded harmonic function must be constant. 
The use of Liouville’s theorem is to show that if the quotient of two
complex analytic functions is bounded then the functions are complex
multiples of each other. So for example, if a polynomial p(z) of degree n
and the function 1 are such that the quotient 1/p(z) is bounded
everywhere then p(z) is a constant. Consider now a non-constant
polynomial p(z) with no zeros. It is therefore bounded away from zero, and
so the quotient 1/p(z) is bounded everywhere, and is therefore constant, which
is a contradiction. Therefore the polynomial p(z) must have a zero: this is the
fundamental theorem of algebra. In the same way a function that has
zeros at the points $a_1, a_2, \dots, a_n$ and becomes infinite like 1/z (simple
poles) at the points $b_1, b_2, \dots, b_n$ and is otherwise neither zero nor
infinite (including at infinity) must be a constant multiple of the
rational function

\[ \frac{(z - a_1)(z - a_2)\cdots(z - a_n)}{(z - b_1)(z - b_2)\cdots(z - b_n)}. \]

Appendix F 


Möbius Transformations


Möbius transformations are needed for a look at the work of Schwarz
and Poincaré on the hypergeometric equation, which requires a
modest amount of complex analysis.


F.1 Möbius Transformations


A Möbius transformation is a map of the complex plane to itself of the
form

\[ z \mapsto \frac{az + b}{cz + d}, \]

where a, b, c, d are complex numbers and $ad - bc \neq 0$. We say that the
point $z = -d/c$ goes to $\infty$ and that $\infty$ goes to $a/c$, and strictly
speaking we should say that the map is of the extended complex plane
to itself.

A proper Möbius transformation is a map of the complex plane of
the form

\[ z \mapsto \frac{az + b}{cz + d}. \]

A proper Möbius transformation is obtained by following the map
$z \mapsto \bar{z}$ with a Möbius transformation, and for most purposes it is
enough to work with proper Möbius transformations. I shall drop the
word “proper” when it can be inferred from the context.
Möbius transformations have the convenient property that the
transformations

\[ z \mapsto \frac{az + b}{cz + d} \quad\text{and}\quad z \mapsto \frac{kaz + kb}{kcz + kd}, \qquad k \neq 0, \]

are the same, and so the inverse of the Möbius transformation

\[ z \mapsto \frac{az + b}{cz + d} \]

is easy to write down: it is, as you should now check,

\[ z \mapsto \frac{dz - b}{-cz + a}. \]


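Show that if $\mu$ is the Möbius transformation

\[ z \mapsto \frac{az + b}{cz + d}, \]

and $\mu'$ is the Möbius transformation

\[ z \mapsto \frac{a'z + b'}{c'z + d'}, \]

then the Möbius transformation $\mu\mu'$ (performing $\mu'$ first) is

\[ z \mapsto \frac{a''z + b''}{c''z + d''}, \]

where $a'', b'', c'', d''$ are given by the matrix product

\[ \begin{pmatrix} a'' & b'' \\ c'' & d'' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix}. \]

This allows us to use the convenient notation for the Möbius
transformation

\[ z \mapsto \frac{az + b}{cz + d}, \]

namely $z \mapsto A(z)$, where $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. So, writing $A'$ for the matrix for $\mu'$ in the obvious
way, the matrix for $\mu\mu'$ is $AA'$.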
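
A numerical check of the matrix rule, and of the formula for the inverse, may be reassuring. This is a minimal sketch: the helper names `mobius`, `matmul` and `inverse`, the two sample matrices, and the test point are all my own illustrative choices.

```python
def mobius(m, z):
    """Apply the transformation z -> (a*z + b)/(c*z + d) with matrix m = ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def matmul(m, n):
    """Ordinary 2 x 2 matrix product m * n."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def inverse(m):
    """Matrix of the inverse transformation, z -> (d*z - b)/(-c*z + a)."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))

A = ((2, 1j), (1, 3))          # matrix of mu
A_prime = ((1, -1), (1j, 2))   # matrix of mu'
z = 0.3 + 0.7j

composite_by_matrix = mobius(matmul(A, A_prime), z)   # matrix A A'
composite_by_maps = mobius(A, mobius(A_prime, z))     # mu' first, then mu
round_trip = mobius(inverse(A), mobius(A, z))
print(composite_by_matrix, composite_by_maps, round_trip)
```

The first two numbers should agree, and the third should return the starting point z.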


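The derivative of the Möbius transformation

\[ f(z) = \frac{az + b}{cz + d} \]

is

\[ f'(z) = \frac{a(cz + d) - c(az + b)}{(cz + d)^2} = \frac{ad - bc}{(cz + d)^2}. \]

Since $ad - bc \neq 0$ the derivative never vanishes, and this means that
the Möbius transformation is angle-preserving everywhere.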

Because a Möbius transformation is determined by the three ratios
a : b : c : d it is easy to see that there is a unique Möbius transformation
sending any three distinct points to any three distinct points. In
particular, the Möbius transformation

\[ z \mapsto \frac{az + b}{cz + d} \]

has this effect:

\[ 0 \mapsto b/d, \qquad 1 \mapsto (a + b)/(c + d), \qquad \infty \mapsto a/c. \]


Show that the only Möbius transformations that map the set of three
points $\{0, 1, \infty\}$ to itself are given by

\[ z \mapsto z, \quad 1 - z, \quad \frac{1}{z}, \quad \frac{z - 1}{z}, \quad \frac{z}{z - 1}, \quad\text{and}\quad \frac{1}{1 - z}. \]

This group of transformations is important in the study of Riemann’s P-functions.
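
One half of this exercise, that each of the six maps permutes the three points and that the six permutations are all different, can be checked mechanically. The sketch below uses homogeneous coordinates so that $\infty$ needs no special arithmetic; the dictionary names and the string labels are my own conventions.

```python
from fractions import Fraction

# Matrices ((a, b), (c, d)) of the six maps z -> (a*z + b)/(c*z + d).
MAPS = {
    "z": ((1, 0), (0, 1)),
    "1 - z": ((-1, 1), (0, 1)),
    "1/z": ((0, 1), (1, 0)),
    "(z - 1)/z": ((1, -1), (1, 0)),
    "z/(z - 1)": ((1, 0), (1, -1)),
    "1/(1 - z)": ((0, 1), (-1, 1)),
}

# Homogeneous coordinates (x, y) stand for the point x/y; (1, 0) is infinity.
POINTS = {"0": (0, 1), "1": (1, 1), "inf": (1, 0)}

def image(m, p):
    """Label of the image of the point p under the map with matrix m."""
    (a, b), (c, d) = m
    x, y = p
    u, v = a * x + b * y, c * x + d * y
    return "inf" if v == 0 else str(Fraction(u, v))

permutations = {
    name: tuple(image(m, POINTS[label]) for label in ("0", "1", "inf"))
    for name, m in MAPS.items()
}
for name, perm in permutations.items():
    print(name, perm)
```

Since there are only 3! = 6 permutations of three points, and a Möbius transformation is determined by the images of three points, the list is complete once the six permutations are seen to be distinct.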
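The equation of the circle centre (a, b) and radius r is

\[ x^2 + y^2 - 2ax - 2by + c = 0, \]

where $r^2 = a^2 + b^2 - c$. Show that the equation can be written as

\[ z\bar{z} - \bar{\alpha}z - \alpha\bar{z} + c = 0, \]

where $\alpha = a + ib$ and $c = \alpha\bar{\alpha} - r^2$. The equation of the circle can also
be written in the suggestive forms

\[ \bar{z} = \frac{\bar{\alpha}z - c}{z - \alpha} \]

and

\[ \bar{z} = Az, \]

where $A = \begin{pmatrix} \bar{\alpha} & -c \\ 1 & -\alpha \end{pmatrix}$. Thus, for example, the unit circle, which has
equation $z\bar{z} = 1$, can be written in the form

\[ \bar{z} = Az, \]

where $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.

It is sometimes convenient to speak of the circle $(\alpha, c)$, which has
radius r given by $c = \alpha\bar{\alpha} - r^2$.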


F.2 Inversion in a Circle 


Inversion in a circle of radius r and centre O is the map that sends a
point P to the point Q on the half line OP and such that $OP \cdot OQ = r^2$. In
particular, inversion in the unit circle with centre at the origin can be
written as

\[ z \mapsto \frac{1}{\bar{z}} = \frac{z}{z\bar{z}}. \]

Show, by using a sequence of scalings and translations, that inversion in
the circle centre $\alpha$ and radius r is the map

\[ z \mapsto \frac{r^2}{\bar{z} - \bar{\alpha}} + \alpha = \frac{\alpha\bar{z} - c}{\bar{z} - \bar{\alpha}}. \]

We shall be interested in the effect of inversion on lines and circles and
on angles between lines and circles, and without loss of generality, we
may assume we are inverting in the unit circle, $z\bar{z} = 1$.


Consider the locus defined by $kz\bar{z} - \bar{\alpha}z - \alpha\bar{z} + c = 0$, where k = 0 or
1, which defines a circle when $k = 1$ and a straight line when $k = 0$.
Show that under inversion, this transforms to the locus

\[ k\,\frac{1}{\bar{z}}\,\frac{1}{z} - \bar{\alpha}\,\frac{1}{\bar{z}} - \alpha\,\frac{1}{z} + c = 0, \]

which simplifies to $cz\bar{z} - \bar{\alpha}z - \alpha\bar{z} + k = 0$. This
yields four cases:

(1) $k = 1$, $c \neq 0$: a circle not through the origin goes to a circle not
through the origin;

(2) $k = 1$, $c = 0$: a circle through the origin goes to a straight line not
through the origin;

(3) $k = 0$, $c \neq 0$: a straight line not through the origin maps to a circle
through the origin;

(4) $k = 0$, $c = 0$: a straight line through the origin maps to itself.


In each case, these statements need to be modified to take note of 
the fact that the origin has been deleted—we shall henceforth assume 
that this has been done. 

Notice that in case (1), which is the case of most interest, we may
write the transformed equation as

\[ z\bar{z} - \frac{\bar{\alpha}}{c}z - \frac{\alpha}{c}\bar{z} + \frac{1}{c} = 0. \]

This makes it clear that the image circle has centre $\alpha/c$ and radius $r/|c|$.
Deduce that the centre of the original circle does not go to
the centre of the transformed circle.
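
These claims are easy to test on a particular circle. In the sketch below the circle (centre $3 + i$, radius 2) is an arbitrary choice of mine, as are the names; the centre $\alpha/c$ and radius $r/|c|$ of the image are taken from the case (1) formula, with $c = \alpha\bar{\alpha} - r^2$.

```python
import cmath
import math

def invert(z):
    """Inversion in the unit circle, z -> 1/conj(z)."""
    return 1 / z.conjugate()

alpha, r = 3 + 1j, 2.0
c = abs(alpha) ** 2 - r ** 2                  # c = alpha*conj(alpha) - r^2
image_centre, image_radius = alpha / c, r / abs(c)

# Twelve points on the original circle, and how far their images lie
# from the predicted image circle.
points = [alpha + r * cmath.exp(2j * math.pi * k / 12) for k in range(12)]
errors = [abs(abs(invert(z) - image_centre) - image_radius) for z in points]

# The image of the old centre is not the new centre.
centre_gap = abs(invert(alpha) - image_centre)
print(max(errors), centre_gap)
```

The first number should be zero up to rounding, while the second is visibly positive.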

We also need the concept of inversion in a straight line. A straight
line has an equation of the form bx − ay = c, which can be written as

\[ \bar{\alpha}z + \alpha\bar{z} = 2c, \quad\text{or}\quad \bar{z} = \frac{-\bar{\alpha}z + 2c}{\alpha}. \]

Reflection in this line is the map

\[ z \mapsto \frac{-\alpha\bar{z} + 2c}{\bar{\alpha}}. \]

For example, the x-axis, y = 0, has the equation $\bar{z} = z$ (in this case
$\alpha$ is purely imaginary), and reflection in it is given by $z \mapsto \bar{z}$.

Find the angle between two circles by applying the cosine rule to
the triangle formed by their centres and the relevant point of
intersection. If the circles are $(\alpha, c)$ and $(\alpha', c')$ with radii r and r′,
respectively, show that the square of the distance between their centres is $|\alpha - \alpha'|^2$
and that the cosine of the angle between their radii is

\[ \frac{\alpha\bar{\alpha}' + \bar{\alpha}\alpha' - (c + c')}{2rr'}. \]

(a) Deduce that the circles are perpendicular if and only if

\[ \alpha\bar{\alpha}' + \bar{\alpha}\alpha' = c + c'. \tag{F.1} \]

(b) Deduce that a circle $(\alpha', c')$ is perpendicular to the unit circle if
and only if $c' = 1$.


Consider the circles $(\alpha, c)$ and $(\alpha', c')$, which we assume intersect,
with radii r and r′, respectively. Show that inversion is angle-preserving
(up to sign, so strictly speaking one should say angle
reversing) by showing that inversion in the unit circle sends them to
the circles $(\alpha/c, 1/c)$ and $(\alpha'/c', 1/c')$, and that their radii are likewise
transformed to $r/|c|$ and $r'/|c'|$, respectively. Deduce that the angle
between the transformed circles is the same as the angle between the
original circles, and so inversion is angle preserving.

Because inversion is a Möbius transformation, and Möbius
transformations are angle-preserving, inversion in the circle $(\alpha, c)$
maps the circle $(\alpha', c')$ to itself if and only if the circles are at right
angles.

We now connect proper Möbius transformations and inversions by
showing that every proper Möbius transformation is a product (not in
a unique way) of two inversions.

We already know that there is a unique Möbius transformation that
maps the points a, b, c in that order to the points $0, 1, \infty$ in that order.
So it is enough to find a product of two inversions that has the same
effect. Consider the inversion that maps a to 0 and c to $\infty$ given by

\[ z \mapsto \frac{\bar{z} - \bar{a}}{\bar{z} - \bar{c}}. \]

It maps b to $b'$, say. Now we consider the inversion that maps $(0, b', \infty)$
to $(0, 1, \infty)$ given by $z \mapsto \bar{z}/\bar{b}'$. The composite of these two has the
required effect.
Draw a picture of the map 


zb etl a. 


Hint: its fixed points are $z = 0$ and $z = \infty$.
Now draw a picture of the map 
Z-a@ 3{(Z-@ 
Zh Z, pear aiee eomil3 panies 
Z=p 20 
Hint: what are the two fixed points of this map? 


Can you conjugate the second map into the first by a map that sends
$\alpha$ to 0 and $\beta$ to $\infty$?


F.2.1 Maps of the Unit Disc to Itself 


A map of the unit disc to itself necessarily maps the unit circle to the
unit circle, so we restrict our attention to Möbius transformations. As
we have seen, a circle has an equation of the form $\bar{z} = A(z)$. A Möbius
transformation can be written as

\[ z' = M(z), \]

or $z = M^{-1}(z')$. So the equation of the image circle under this Möbius
transformation is

\[ \bar{M}^{-1}(\bar{z}') = AM^{-1}(z'). \]

So the circle is mapped to itself if

\[ \bar{M}AM^{-1} = kA \]

for an arbitrary non-zero complex number k.
Applied to the unit circle, this says, in terms of the components a, b, c, d of M,
that

\[ \bar{b} = kc, \qquad \bar{a} = kd \]

(together with the conjugate relations $\bar{d} = ka$ and $\bar{c} = kb$).
Therefore, the Möbius transformations mapping the unit circle to itself
are of the form

\[ z \mapsto \frac{az + b}{\bar{b}z + \bar{a}}. \]

The real axis can be regarded as the circle with equation $\bar{z} = Iz$, where
I is the identity matrix, and the same argument shows that a Möbius
transformation maps the upper half-plane to itself if and only if its
entries are all real and its determinant is positive.


F.2.1.1 Coaxial Circles


Because a Möbius transformation can map any three distinct points to
any three distinct points it can map any circle (or straight line) to any
circle (or straight line). So a study of a one-parameter family of circles
will often apply to circles that are somehow tied to two points, as is the
case with what are called coaxial circles.

Coaxial circles come in two main families. The first family consists
of all the circles (and straight lines) through two given points. We shall
take the points to be $z = \pm 1$, so the circles have their centres on the y-axis
and $\alpha$ is purely imaginary ($\alpha = ib$). The condition that the
circles pass through the points ±1 forces $c = -1$. So circles in the first
coaxial family have equations of the form

\[ x^2 + y^2 - 2by - 1 = 0, \qquad \alpha = ib,\ c = -1. \tag{F.2} \]


The second family seems artificial at first sight. We start from the
observation that if

\[ S = x^2 + y^2 - 2ax - 2by + c = 0 \]

and

\[ S' = x^2 + y^2 - 2a'x - 2b'y + c' = 0 \]

are the equations of two circles, then the equation

\[ \lambda S + \mu S' = (\lambda + \mu)(x^2 + y^2) - 2(\lambda a + \mu a')x - 2(\lambda b + \mu b')y + (\lambda c + \mu c') = 0 \]

is also the equation of a circle.
We now regard the points $z = -1$ and $z = 1$ as circles of zero
radius, and consider what circles we get by the above routine. The
equations of the point circles, as they are called, are

\[ S = x^2 + y^2 + 2x + 1 = 0 \]

and

\[ S' = x^2 + y^2 - 2x + 1 = 0. \]

So the family of circles that we obtain has equations of the form

\[ \lambda S + \mu S' = (\lambda + \mu)(x^2 + y^2) + 2(\lambda - \mu)x + (\lambda + \mu) = 0, \]

which we write in the form

\[ x^2 + y^2 + 2\,\frac{\lambda - \mu}{\lambda + \mu}\,x + 1 = 0, \]

or, setting $\dfrac{\lambda - \mu}{\lambda + \mu} = a$, as

\[ x^2 + y^2 + 2ax + 1 = 0, \qquad \alpha = -a,\ c = 1. \tag{F.3} \]

If we now apply the rule for determining if two circles meet at right
angles, (F.1), with $\alpha = ib$, $c = -1$ for the first family and $\alpha' = -a$, $c' = 1$
for the second, we find that

\[ \alpha\bar{\alpha}' + \bar{\alpha}\alpha' = -iba + iba = 0 \quad\text{and}\quad c + c' = 0, \]


so every member of the first coaxial family is perpendicular to every 
member of the second coaxial family. 
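
The orthogonality of the two families can also be confirmed numerically. The sketch below assumes the perpendicularity criterion in the form $\alpha\bar{\alpha}' + \bar{\alpha}\alpha' = c + c'$ for circles $(\alpha, c)$ and $(\alpha', c')$, and, as an independent check, Pythagoras’s theorem on the triangle formed by the two centres and a point of intersection. The sample values of a and b, and the function names, are my own.

```python
def perpendicular(alpha1, c1, alpha2, c2, tol=1e-12):
    """Criterion alpha1*conj(alpha2) + conj(alpha1)*alpha2 = c1 + c2."""
    lhs = alpha1 * alpha2.conjugate() + alpha1.conjugate() * alpha2
    return abs(lhs - (c1 + c2)) < tol

def radius_squared(alpha, c):
    """r^2 = alpha*conj(alpha) - c for the circle (alpha, c)."""
    return abs(alpha) ** 2 - c

results = []
for b in (-2.0, 0.5, 3.0):          # first family:  x^2 + y^2 - 2by - 1 = 0
    for a in (-4.0, 1.5, 2.5):      # second family: x^2 + y^2 + 2ax + 1 = 0, |a| > 1
        first = (complex(0, b), -1.0)     # centre ib, c = -1
        second = (complex(-a, 0), 1.0)    # centre -a, c = +1
        by_criterion = perpendicular(*first, *second)
        # Circles cut at right angles exactly when d^2 = r^2 + r'^2.
        d2 = abs(first[0] - second[0]) ** 2
        by_pythagoras = abs(d2 - radius_squared(*first) - radius_squared(*second)) < 1e-9
        results.append((by_criterion, by_pythagoras))
print(results)
```

Every pair tried should pass both tests.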

A striking version of this result is obtained by considering, as we
may, all the circles and straight lines through the points $z = 0$ and
$z = \infty$. The first family of coaxial circles in this case consists of all lines


through the origin, and the second of all the circles whose centres are at 
the origin. Because the first picture can be conjugated into the second, 
they are equivalent for the purposes of inversion. 

There is a third family of coaxial circles, which is obtained by 
starting with two coincident points, but we shall not need it. 


Appendix G 


Lipschitz and Picard 


In 1877 the German mathematician Rudolf Lipschitz published a 
paper [190], in which he observed that the question of the existence of 
solutions to a system of ordinary differential equations had been solved 


in the complex analytic case by Weierstrass [266] and Briot and 
Bouquet ([24], 49), and solutions shown to exist at least on some 
suitable domain. The method first obtained a formal power series that 
“solves” the equation, and then shows that on some domain around the 
initial point the series converges. However, there was no proof that a 
system of ordinary differential equations can be solved in the real case, 
and this was a gap he proposed to fill. 

The system of equations for functions $y^1, y^2, \dots, y^n$ has the form

\[ \frac{dy^a}{dx} = f^a(x, y^1, \dots, y^n) \qquad (a = 1, 2, \dots, n). \]

Lipschitz assumed that the functions $f^a$ are defined and continuous
on some domain G where they are bounded above by some given
quantity. Furthermore, he assumed that

\[ |f^a(x, k^1, \dots, k^n) - f^a(x, y^1, \dots, y^n)| < c^1|k^1 - y^1| + \cdots + c^n|k^n - y^n|, \tag{G.1} \]

where the quantities $c^b$ are positive constants. The initial conditions
are that when $x = x_0$, $y^a = y_0^a$, and the point $(x_0, y_0^1, \dots, y_0^n)$ lies (as we
would say) in the interior of G, so that there are positive quantities
$a_0$, $b_0$ such that if the point $(x, y^1, \dots, y^n)$ satisfies the inequalities

\[ |x - x_0| < a_0, \qquad |y^a - y_0^a| < b_0, \]

then it lies in G.
Lipschitz was then able to show the existence of a domain H lying
entirely in G such that there is a system of functions $y^1, y^2, \dots, y^n$ that
satisfies the initial conditions at $x_0$ and lies inside H for $|x - x_0| < A_0$ for
some $A_0 > 0$. Lipschitz then divided the interval from $x_0$ to $x_0 + A_0$ into p
equal pieces (the interval from $x_0 - A_0$ to $x_0$ is dealt with similarly). In each
of these intervals he found quantities $y_k^a$ such that

\[ y_1^a - y_0^a = f^a(x_0, y_0^1, \dots, y_0^n)(x_1 - x_0) \qquad (a = 1, 2, \dots, n) \]

and

\[ y_{k+1}^a - y_k^a = f^a(x_k, y_k^1, \dots, y_k^n)(x_{k+1} - x_k) \qquad (a = 1, 2, \dots, n), \]

where $x_0, x_1, \dots, x_p$ are the points of subdivision. The inequality (G.1) shows that all these
points lie in H. Lipschitz then proved that as a finer and finer
subdivision is produced, the variables tend to limits that establish the
existence of solutions to the system of ordinary differential equations.
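
The step-by-step construction on a subdivided interval can be imitated numerically. The following is a modern sketch and not Lipschitz’s own procedure in detail: each step advances every variable by the value of the right-hand side at the start of the piece times the length of the piece. The function name and the test system ($y^1{}' = y^2$, $y^2{}' = -y^1$, whose solution through (0, 1) is (sin x, cos x)) are my own choices.

```python
import math

def polygon_method(f, x0, y0, length, p):
    """Divide [x0, x0 + length] into p equal pieces and, on each piece, advance
    every y^a by f^a(x_k, y_k) * (x_{k+1} - x_k)."""
    h = length / p
    x, y = x0, list(y0)
    for _ in range(p):
        slopes = f(x, y)
        y = [yi + si * h for yi, si in zip(y, slopes)]
        x += h
    return y

def rotation_system(x, y):
    """Right-hand sides f^1 = y^2 and f^2 = -y^1."""
    return [y[1], -y[0]]

# Error in y^1 at x = 1 for finer and finer subdivisions.
errors = [
    abs(polygon_method(rotation_system, 0.0, [0.0, 1.0], 1.0, p)[0] - math.sin(1.0))
    for p in (10, 100, 1000)
]
print(errors)
```

The errors shrink as p grows, which is the behaviour that the limiting argument makes rigorous.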

The novelty of Lipschitz’s method is partly that it applies to systems
of ordinary differential equations, and partly that inequality (G.1) is
weaker than Cauchy’s conditions; it is, indeed, the famous Lipschitz
condition. That said, it is clear that Lipschitz did not know of Cauchy’s
earlier paper, which he never mentioned.


G.1 Picard’s Method 
Let us first take a single first-order equation¹⁸

\[ \frac{dy}{dx} = F(x, y); \]

then, setting $y = y_0$ when $x = x_0$, one can establish the
fundamental existence theorem for this equation. To this end,
consider the equations

\[ \frac{dy_1}{dx} = F(x, y_0), \]

\[ \frac{dy_2}{dx} = F(x, y_1), \]

\[ \cdots \]

\[ \frac{dy_n}{dx} = F(x, y_{n-1}), \]

effecting each quadrature¹⁹ in such a way that for $x = x_0$ one
has $y_i = y_0$. The problem is to prove that, as $n \to \infty$, $y_n$ tends to
a limit y which represents the desired integral provided that x
remains in the neighborhood of $x_0$. We assume that the function
F(x, y) is continuous and defined for values of x and y between
$x_0 - a$ and $x_0 + a$ on the one hand and $y_0 - b$ and $y_0 + b$ on the
other; moreover, that one can determine a positive constant k
such that

\[ |F(x, y_2) - F(x, y_1)| < k|y_2 - y_1|; \]

we also assume that the function and the variables are real.
Let M be the maximum modulus of F(x, y) when x and y
remain between the indicated limits. One will have

\[ y_1 = \int_{x_0}^{x} F(x, y_0)\,dx + y_0. \]

Let $\rho$ be a quantity at most equal to a: $y_1$ will stay within the
desired limits if $M\rho < b$, and it is evident that the same will be
true for $y_2, \dots, y_n$. Letting $\delta$ denote a quantity at most equal to
$\rho$, we will suppose that x remains between $x_0 - \delta$ and $x_0 + \delta$.
We then have, on putting $y_n - y_{n-1} = z_n$,

\[ \frac{dz_1}{dx} = F(x, y_0), \]

\[ \frac{dz_2}{dx} = F(x, y_1) - F(x, y_0), \]

\[ \cdots \]

\[ \frac{dz_n}{dx} = F(x, y_{n-1}) - F(x, y_{n-2}), \]

and all the z vanish at $x = x_0$. One has $|z_1| < M\delta$, $|z_2| < kM\delta^2$,
and generally,

\[ |z_n| < M\delta(k\delta)^{n-1}. \]

Hence, writing

\[ y_n = y_0 + z_1 + z_2 + \cdots + z_n, \]

one sees that $y_n$ tends to a limit if $k\delta < 1$. As a decreasing
geometric progression, the series

\[ z_1 + z_2 + \cdots + z_n + \cdots \]

will be convergent. Thus $y_n$ converges to a limit y when x
remains between $x_0 - \delta$ and $x_0 + \delta$, $\delta$ being the smallest of the
quantities a, b/M, 1/k. In this interval, y evidently represents a
continuous function of x. Thus one also has

\[ y_n = \int_{x_0}^{x} F(x, y_{n-1})\,dx + y_0, \]

and, as $y_n$ and $y_{n-1}$ tend to y, it follows that

\[ y = \int_{x_0}^{x} F(x, y)\,dx + y_0, \]

and hence $\dfrac{dy}{dx} = F(x, y)$; that is, the limit y satisfies the
differential equation. Thus, the existence of the solution has
been established. One can evidently employ the same type of
proof if F is an analytic function of the complex variables z and
w.
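
The successive approximations are easy to compute exactly in a simple case. The sketch below takes, purely as an illustration of my own choosing, the equation $dy/dx = y$ with $y = 1$ when $x = 0$, so that F(x, y) = y and each approximation is a polynomial; a polynomial is stored as its list of coefficients, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def picard_step(poly, y0):
    """Given y_{n-1} as coefficients [c_0, c_1, ...], return
    y_n = y0 + integral from 0 to x of y_{n-1}, which is the case F(x, y) = y."""
    integral = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]
    integral[0] += y0
    return integral

def picard(n, y0=Fraction(1)):
    """The n-th successive approximation, starting from the constant y_0."""
    poly = [y0]
    for _ in range(n):
        poly = picard_step(poly, y0)
    return poly

fourth = picard(4)
print(fourth)   # coefficients of 1 + x + x^2/2 + x^3/6 + x^4/24
```

In this case the n-th approximation is the n-th partial sum of the exponential series, and the differences $z_n = x^n/n!$ are visibly dominated by a convergent series on any bounded interval.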


Appendix H 


The Assessment 


H.1 Introduction 


In any historical or reflective essay it’s always good to push for more 
evidence and better arguments. If you want to claim that a book is 
important because it marks a significant advance, ask yourself: What 
advance? Why was that important? How does that book do it? When 
you have answered those questions, ask the next round of questions, 
such as: What was known before? Who said it was important? If, say, 
the book displays an improved use of the calculus then what was that 
improvement? Was it a new technique? An old technique in a new 
application? And so on. 

As for evidence, quotes always help, such as, in the case of the first 
essay, Euler’s or Clairaut’s comments on how difficult the Principia is to 
read, or Euler’s remark in his Mechanica to the effect that even a slight 
change from one problem to the next can produce great difficulties. 


H.2 Assessment 1 


Set at the end of week 3, to be handed in at the end of week 4, and 
returned to the students at the end of week 5. 

Question 1 Imagine you are a British Professor of mathematics in
about the year 1770 who is recommending a good student to spend a 
year studying mathematics with either Euler or Lagrange. Explain to 
him or her: 

EITHER In what ways is Euler’s theory of mechanics an 
improvement on Newton’s Principia. 

OR What is involved in the study of partial differential equations. 


Your answer should describe what has been taken to be important
in the topic you select, and why.

This question is to be answered in not more than 500 words, which 
is a single side of A4 in 10-point print, one and a half line spaced. 

Please do not use a smaller font size. 

Leave room for me to scribble comments. 

I will NOT turn over the page—your answer must be on one side of 
a piece of A4. 

Contact me directly or by e-mail if you cannot comply with these 
requirements. 

Advice Think hard about what is the significance of what you report 
upon. In particular, do not diminish Newton’s remarkable achievements 
in the Principia. 

When you are offering an opinion or judgement (as in “How
important was ...” or “Why was ...”) give a brief argument in support
of your opinion.

Distinguish between contemporary criticisms of someone’s work 
and your own judgements. Don’t be afraid of offering your own 
judgement: you don’t have to understand everything you've read or 
heard, but do not say anything from the perspective of the present day 
that could not have been said in 1770. Think about what was picked up, 
what was missing in the actual reception of these ideas. 

You do not have very many words, so anything you say about people 
must be essential—think of the mathematical styles of Euler and 
Lagrange, not their personalities. 

No need to mention anything of a family or personal nature— 
imagine you've written all that good stuff elsewhere in the letter. 

You may find it helpful to write notes for the essay first. Try to use 
not more than 150 words and observe the following rules: 

Each note should be one sentence long and should contain exactly 
one idea. 

The sentences should be organised in groups according to topic. 

The sentences, and the topics, should be arranged in a sensible 
order. 

When you have finished, you should be confident that you can write 
an essay of the required length in which the topics come in this order. 


H.3 Assessment 2 


EITHER Choose one of the following, and write an essay based on it that 
demonstrates some understanding of the mathematics, and situates the 
people and the ideas in a historical context. 

Spend roughly three pages describing the most important features 
of the text, and a page incorporating your analysis into an account of its 
importance when it was published. 


• Cauchy: Note on the integration of first-order partial differential
equations in any number of variables (see the translation in Sect. 31.1).

• Thomson and Stokes on the telegraphist’s equation.


OR write an article on the hypergeometric equation, covering the 
work of Gauss, Riemann, Schwarz, and Poincaré. 

OR write an article on the works of Schwarz mentioned in the 
course. 

[The course website had pdf copies of all of these essays.]

This question is to be answered in not more than 2200 words, 
which is four sides of A4 in 11-point print, one and a half line spaced, 
(or up to four and a half pages in TeX). 

Please do not use a smaller font size. 

A further page may be used for diagrams, and a modest number of 
extra words (I won't count them). 

Bibliographical information should also be given at the end—I won't 
count the words. 

Leave room for me to scribble comments. 

Comments on these passages will be found below. 


Advice on Choosing an Extract 

In order to choose the text you intend to work on, I suggest that you 
read all the texts quickly over once and find (at least) one you want to 
proceed with. None of them are altogether easy to read. Let the obscure 
bits wash over you and wait for something more comprehensible to 
turn up. You may find that an offending paragraph has an easier second 
part, or that it is followed by an easier one. Try to form some sense of 
what the text is about. Terminology can be unclear. List the terms you 
don’t know the meaning of and e-mail me if there’s a problem. 


Once you have chosen a text or set of texts, a good general strategy 
would be to look quickly at each of its sections and write down what is 
claimed in each of them without at this stage worrying about how 
anything was proved. This will give you a skeleton to work on and 
enable you to see the general argument. 

Grapple with as much of one of these texts as you can. If parts are 
too hard, be sure that you need to understand and describe them—you 
may or may not. You may make a modest use of footnotes to alert me 
that you did not understand something. 

The essays by Thomson²⁰ can be found in the Digital Mathematics 
Library and the Internet Archive on the web, which send you to Gallica, 
the site of the Bibliothèque Nationale in Paris. You may find these easier 
to read than a printed copy, but the graph in Thomson’s paper is not at 
all clear. The copy here is from the original publication in the 
Proceedings of the Royal Society for 1855.²¹ 


Advice on writing your essay. The main point of this exercise is to put 
you in the situation of a mathematician (student or professional) who 
has just studied a mathematical topic (one whose history we are 
studying in this course), and to have you show that you understand the 
mathematics and its importance. 

For some of the extracts, you may well want to say that the author’s 
reasoning is odd, even wrong; if you do so, prove your claim. 

It is better to demonstrate a real understanding of a piece of the 
mathematics than a superficial understanding of all of the extract. 

Your secondary task is to say something historical about the extract 
and its author that uses your analysis of the extract to establish its 
importance in the context of its time. 

You may stray beyond the Lecture Notes and draw on information 
accessible in a good library (for example, the Dictionary of Scientific 
Biography and standard histories of mathematics such as Kline’s) or on 
the Web (keep a critical edge), but maximum marks are available for 
information entirely drawn from the Notes. 


A Comment on How the Assignment Will Be Marked 
I am looking for well-written essays, so an extra mark will be given for 
essays that are well organised and literate. So, for example, a coherent 
account that is mathematically correct and insightful but presented in 
ungrammatical prose will get one mark less than the same account in 
grammatical prose. If you want this mark, avoid English that is too 
conversational, flippant, or childish. Address to impress! 

If you want to get marks for good writing in your final essay but 
aren't sure how, let me know. I can also look (but only quickly) at 
specific requests for help, brief outlines, and the like. 


H.3.1 Cauchy 


Two things are required here. 


• I want you to grapple with Cauchy’s argument and to compare it 
with Monge’s account and with a modern account, such as the one in 
the appendix that forms Chap. C. 

• I want you to explain the crucial difference between the 
(quasi)-linear case and the general case. 

• The hardest part of the paper is seeing how Cauchy handled the 
initial conditions. He succeeds, but with unfortunate notation and 
less clarity than one would like. You may find it easier to work back 
from the modern account (see Appendix C). 

• You may find it helpful to use the following partial differential 
equation as your worked example, if you give one: 


F(x, y, z, p, q) = x p^2 + y q^2 - 2z = 0, 


with initial conditions that along the initial curve 


(x(0, s), y(0, s), z(0, s), p(0, s), q(0, s)) = (1 + s, 1 - s, s^2, s, -s),   1/4 < s < 1. 


Or you may prefer to be clear what is involved in solving this 
equation, and use your knowledge to assess the ideas of Cauchy. Do 
not attempt to reach the complete solution in parametric form 
(z − 1/z, etc.) but indicate the integrals that would have to be 
evaluated and, assuming that they have been evaluated, indicate how 
the initial conditions are used. 


Ideally, you will finish up understanding the subject of first-order 
partial differential equations much better as a result. 
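
As a sanity check on the suggested worked example, the following is a minimal sketch in Python, not part of the original course material. It assumes that the equation reads F = x p^2 + y q^2 - 2z = 0 and that the initial data are (1 + s, 1 - s, s^2, s, -s); it assumes the characteristic equations in their standard modern form; and all function names are hypothetical. It checks that the initial strip satisfies F = 0 and the strip condition dz/ds = p dx/ds + q dy/ds, and that F remains (numerically) zero along a characteristic.

```python
import math

def F(x, y, z, p, q):
    # ASSUMED reading of the worked example: x p^2 + y q^2 - 2 z.
    return x * p**2 + y * q**2 - 2.0 * z

def initial_strip(s):
    # ASSUMED initial data: (x, y, z, p, q) = (1 + s, 1 - s, s^2, s, -s).
    return (1.0 + s, 1.0 - s, s**2, s, -s)

def strip_defect(s, eps=1e-6):
    # dz/ds - (p dx/ds + q dy/ds), estimated by central differences.
    x1, y1, z1, _, _ = initial_strip(s + eps)
    x0, y0, z0, _, _ = initial_strip(s - eps)
    _, _, _, p, q = initial_strip(s)
    dx = (x1 - x0) / (2 * eps)
    dy = (y1 - y0) / (2 * eps)
    dz = (z1 - z0) / (2 * eps)
    return dz - (p * dx + q * dy)

def rhs(state):
    # Characteristic equations (standard modern form):
    #   x' = F_p, y' = F_q, z' = p F_p + q F_q,
    #   p' = -(F_x + p F_z), q' = -(F_y + q F_z).
    x, y, z, p, q = state
    Fp, Fq = 2.0 * x * p, 2.0 * y * q
    Fx, Fy, Fz = p**2, q**2, -2.0
    return (Fp, Fq, p * Fp + q * Fq, -(Fx + p * Fz), -(Fy + q * Fz))

def rk4_step(state, h):
    # One classical fourth-order Runge-Kutta step.
    def shifted(k, c):
        return tuple(u + c * ki for u, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return tuple(u + h / 6 * (a + 2 * b + 2 * c + d)
                 for u, a, b, c, d in zip(state, k1, k2, k3, k4))

def follow_characteristic(s, t_end=0.2, steps=200):
    # Integrate from the initial strip at parameter s up to t = t_end.
    state = initial_strip(s)
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

if __name__ == "__main__":
    for s in (0.3, 0.5, 0.9):
        end = follow_characteristic(s)
        print(s, F(*initial_strip(s)), strip_defect(s), F(*end))
```

The short integration time is deliberate: under the assumed equation, q satisfies q' = q(2 - q) with q(0) = -s < 0, so q blows up in finite time, and the sketch stays well inside the interval where the characteristic exists.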


H.3.2 Thomson 


These are some of the actual documents that led to the creation of the 
first successful trans-Atlantic telegraph, so they are genuinely applied 
mathematics, and I thought it might be interesting for you to work out 
some of the thinking behind it. So perhaps we want a greater sense of 
the difficulties and the uncertainties involved in the work, as well as of 
the mathematics that was used, and an indication of what was good 
about it. 


H.3.3 The Hypergeometric Equation 


I suggest that this essay has an introduction (where you write down the 
hypergeometric equation and comment on its key properties) and a 
conclusion, and in between one page on what Gauss did and rather 
more on the contributions of Riemann, Schwarz, and Poincaré. 
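
For orientation, the hypergeometric equation and the hypergeometric series are commonly written as follows. The parameter names α, β, γ and the dependent variable w are notational assumptions here and may differ from those used in the course texts.

```latex
z(1-z)\,\frac{d^2 w}{dz^2}
  + \bigl[\gamma - (\alpha+\beta+1)\,z\bigr]\frac{dw}{dz}
  - \alpha\beta\, w = 0,
\qquad
F(\alpha,\beta,\gamma;z)
  = 1 + \frac{\alpha\beta}{1\cdot\gamma}\,z
  + \frac{\alpha(\alpha+1)\,\beta(\beta+1)}{1\cdot 2\cdot\gamma(\gamma+1)}\,z^2
  + \cdots
```

The equation has singular points at z = 0, 1, and ∞, and the series (defined when γ is not zero or a negative integer) converges for |z| < 1 and satisfies the equation there.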

For Gauss, state what he discovered about solutions of the 
hypergeometric equation in the neighbourhood of the points 
z = 0, 1, ∞, and explain the distinction he drew between a (possibly 
many-valued) function and its power series expansions. 

For Riemann, explain what a P-function is locally by definition, and 
what the key properties of such functions are. Compare this with the 
usual situation for solutions of a linear second-order ordinary 
differential equation, and indicate the key steps in Riemann’s argument 
that his function is the general solution of the hypergeometric equation 
studied by Gauss. (You can read Riemann’s paper if you are comfortable 
with the idea of analytic continuation around a branch point, or willing 
to become so. I have omitted Section 5 of the paper, which dealt with a 
technical point not needed for a sufficient appreciation.) 

For Poincaré, use Schwarz’s work to explain how special cases of the 
hypergeometric equation lead to triangles with particularly simple 
properties, and then indicate how non-Euclidean geometry naturally 
enters the story (you may take the case where the branch points have 
orders 2, 3, and 7—very attractive figures for this case are on the web, 
and see also Fig. 16.3). 


H.3.4 Schwarz 


His work is strikingly coherent and this essay calls for a detailed 
analysis of both the texts and their context. Your essay should discuss 
his alternating method, explain how it is connected to the so-called 
Schwarz-Christoffel theorem, and explain why mapping the half-plane 
to a triangle, to a square, and to a general quadrilateral (each with 
vertices specified in advance) are problems of increasing difficulty. If 
possible, show how to solve them. 

[I note here that students can also read Green’s “An Essay on the 
application of mathematical analysis to the theories of electricity and 
magnetism”, which is reprinted in his Mathematical Papers (pp. 356-374); 
consult the Digital Mathematics Library and the Internet Archive 
on the web. There are two proofs (Sects. 4 and 5) where Green slips 
between mathematics and physics; it is good to bring them out, to 
make sense of Sect. 6, and then to compare his paper with the ideas of, 
for example, Gauss or Dirichlet.] 


H.4 Assessment 3 


The history of partial differential equations in the nineteenth century 
belongs to applied mathematics, not pure mathematics. To what extent 
do you agree, and why? 


H.4.1 Advice 


Claims like this can have several different kinds of answers. Clearly it 
depends on what is meant by such terms as pure and applied 
mathematics, about which there is legitimate disagreement. 

You might reply, for example, that the claim 


• is clearly true. 

• is clearly false. 

• is true in certain respects but not in others, say because the story 
is better seen as some mix of pure and applied. 

• is only true because what happened is some mix of pure and 
applied mathematics, but it’s better seen as simply mathematics 
underneath. 

• is silly, perhaps because you don’t accept the terms of the debate, 
and find the terms pure and applied are superficial (they may 
mislead) and it’s all simply mathematics. 


Therefore, it is essential to have an opinion, to state it clearly at the 
start, to argue for it, to address such counter-arguments as seem to you 
to have merit, and to reach a substantial conclusion. 

To organise your thoughts, you should think about the aims, 
methods, and results of several mathematicians of the period, and 
about how they might have answered the question. Think also about 
the significant theoretical advances made in the study of partial 
differential equations in the nineteenth century, as well as the 
significant problems that were solved and applications that were made. 

Your essay should respond to the work of most of the leading 
mathematicians mentioned in the course. The nineteenth century is 
here defined to cover the period discussed from Lecture 10 to the end 
of the course. 

When you are ready to start writing, structure your essay so that it 
is easy to appreciate. 


• The Introduction should be very brief, but indicate the key points 
that you will make in your essay. In particular, state the opinion that 
you are going to argue for. 

• You should decide what you mean by such terms as pure 
mathematics, applied mathematics, geometry, and even physics. You 
may find it helpful to define what you mean by these terms early in 
your essay. 

• The body of your essay should be rich but clear. 

• Every paragraph must support your argument, even the ones 
where you are showing why your evidence doesn’t support a 
different opinion. 

• Be prepared to defend a complicated position (the first and 
second halves of the nineteenth century were different, for example, 
if you think they were) or an extreme position if you have one. 

• Your answer is a measure of the extent of your agreement or 
disagreement; it is not a string of facts about partial differential 
equations. 

• Facts about partial differential equations are grist to your 
agreement or disagreement and must be presented clearly in their 
own right and as supporting that agreement or disagreement. 

• If you are arguing for a complicated position, it might be a good 
idea to give a paragraph to each single aspect of it. 

• If you are arguing for a simple position (such as “clearly true”, 
“clearly false”, or “silly”) be sure you explain why and offer rebuttals 
of opposing positions. 

• The Conclusion should restate the position of the Introduction, 
but in a way that recalls subtleties and complications acknowledged 
in the body of the essay. 


You have to decide what is right for you, state that position, and 
defend it like a lawyer. Channel your favourite courtroom drama and 
call your expert witnesses. Give clear references to all the sources you 
use, so that they can be checked; for web-based sources give me the URL. 

Full marks are available for essays that draw only on course 
material, but you are welcome to move beyond. 


H.4.2 How the Essays Will Be Marked 


To obtain a mark appropriate to a First: 

Quality of argument: The argument is convincing, supported with 
relevant facts, and well organised. 

The judgements reached are, when necessary, subtle and balanced. 

The coverage is broad—there are no potentially damaging 
omissions. 

There are no unnecessary digressions. 

Ideally, and without being cranky, the essay is original (in its 
emphases, or its conclusions). 

Good use of quotations. 

Accuracy: The historical facts are indeed correct and to the point. 

The mathematics is correct, clear, and relevant. 

The written English, as a piece of prose, is well written, and of the 
right length. 

And remember that extra mark for essays that are genuinely well 
written. 


Upper Second: 
Falls below the above in one or two significant ways. 


Lower Second: 
Falls below what is required for an Upper Second in one or two 
significant ways. An unconvincing or desultory argument. 


Third: 
Shows a bare knowledge of the topic, but is poorly organised and/or at 
times inaccurate. 


Fail: 
Does not demonstrate a knowledge of the topic. 


Borderline distinctions. It can be hard to tell a First from a good Upper 
Second. Roughly speaking, a First-class essay is something you (and I!) 
should be proud of and you (or I) could put with confidence in front of 
anyone, whereas varieties of Second go to good, and even very good, 
work. On any borderline, a good original point can push you up, a 
significant error can push you down. At the other end, a Third says “Yes, 
you know some things you didn’t know before but only just enough” 
and a Lower Second says that you are either generally, if intangibly, 
better than that, or at some identifiable point clearly better than that. A 
Fail mark goes to an essay that merely recycles facts without coherence 
or an argument, or that fails to address the question of importance. 

Plagiarism. The taking of information and arguments from sources 
you do not acknowledge is theft. It will result in a Fail. 
Footnotes 


1 See Principia Axioms, or the laws of motion in the Cohen and Whitman translation of 1999, 
pp. 416-417, and (in the Motte-Cajori translation) pp. xvii-xix and F&G 12. B2. 


2 See Cohen ([46], 229). 


3 See Newton, Principia, Book II, 790. 


4 See Newton Principia I, Sect. 9, 534-545. 


5 See the Cohen and Whitman translation, p. 545. 


6 For this exchange of letters, see Euler Opera Omnia (4A) 5, 173-175 and F&G 14.B2 and 
14.B4. 


7 See Wilson [274] for a good account, and for references to the primary literature. 


8 See Euler Opera Omnia (4) 5, letter 421 and F&G 14.B2(b). 


9 See Clairaut [45] and the extract in F&G 14.B3. 


10 See Kopelevich [163]. 


11 Clairaut then published his own theory of the motion of the Moon in 1753. D’Alembert now 
also withdrew his criticisms of the inverse-square law. 


12 See Euler Opera Omnia (4) V, 195-196 and F&G 14.B4. 


13 See Grigoryan’s account: http://www.math.ucsb.edu/~grigoryan/124A.pdf or google 
Grigoryan partial differential equations. 


14 See, for example, Grigoryan’s account. 


15 See Sommerfeld ([248], 66). 


16 This is because a good way of solving many a problem involving analytic functions is the 
method of undetermined coefficients. 


17 Many other things can happen, but that is not our subject. 


18 This is a lightly corrected version of the translation in the Birkhoff Source Book, 250-251. 


19 Picard here chooses the constants of integration. 


20 Enter http://gallica.bnf.fr/ark:/12148/bpt6k95119q for Thomson. 


21 Enter http://rspl.royalsocietypublishing.org/content/7/382.full.pdf+html. 


